[57] ABSTRACT

Described is a mechanism for dependably tracking web page 
activities among a group of browsers. The web browsers 
retrieve web pages from an HTTP server, with each of the 
web pages embedding an applet. In response to web page 
activities (such as loading or unloading of a web page) 
performed at a browser, the respective applet reports the 
activities (together with the URL of the web page) to a 
synchronization server, which in turn stores them in a 
database. 
METHOD FOR MONITORING USER
INTERACTIONS WITH WEB PAGES FROM 
WEB SERVER USING DATA AND 
COMMAND LISTS FOR MAINTAINING 
INFORMATION VISITED AND ISSUED BY 
PARTICIPANTS 

RELATED APPLICATIONS 

This application relates to five other applications: (1) U.S.
Ser. No. 08/944,757 filed Oct. 6, 1997, entitled "DEPENDABLE
DATA ELEMENT SYNCHRONIZATION MECHANISM"; (2) U.S. Ser. No.
08/944,951 filed Oct. 6, 1997, entitled "DEPENDABLE DATA
ELEMENT TRACKING MECHANISM"; (3) U.S. Ser. No. 08/944,121
filed Oct. 6, 1997, entitled "DEPENDABLE WEB PAGE
SYNCHRONIZATION MECHANISM"; (4) U.S. Ser. No. 08/944,125
filed Oct. 6, 1997, entitled "MECHANISM FOR DEPENDABLY
MANAGING WEB SYNCHRONIZATION AND TRACKING OPERATIONS AMONG
MULTIPLE BROWSERS"; and (5) U.S. Ser. No. 08/944,124 filed
Oct. 6, 1997, entitled "MECHANISM FOR DEPENDABLY ORGANIZING
AND MANAGING INFORMATION FOR WEB SYNCHRONIZATION AND
TRACKING AMONG MULTIPLE BROWSERS".

BACKGROUND OF THE INVENTION 

The present invention relates generally to a method and 
apparatus for coordinating access to Internet web sites by a 
group of web browsers that are being run at a group of user 
terminals. 

It is known that users can retrieve information from web
sites (network sites) via the Internet. The basic model for
retrieving information from web sites is user-initiated
information searching. Specifically, a user interacts (via a
terminal) with a web browser to send a request to a web site.
In response to the request, the web server for the web site
retrieves the requested information and sends it to the web
browser arranged in so-called web page (HTML) format. One of
the distinctive features of this model is the "hyper-text
links" embedded in retrieved web pages, which enable a user
searching for information to "navigate" from one web page to
another.

In order to provide services (or assistance) to users (or
customers) via the Internet, it is desirable to provide a
mechanism to track activities performed on web pages
among a group of browsers.

One method of tracking web pages navigated is to install 
a monitoring program at a web site. When a terminal sends 
requests to a web site, the monitoring program at the web
site collects the URLs for the requested web pages and
sends the URLs to a server. However, under this method, the 
monitoring program is not always able to monitor the 
requests from the terminal, because when the terminal 
retrieves web pages from its browser cache space or from a 
proxy server, the requests are fulfilled locally and are never 
sent to the web site. As a result, the URLs are not accurately 
tracked. 

Another method of tracking web pages navigated is to 
install a monitoring program together with a browser at a 
terminal. The monitoring program constantly communicates 
with the web browser. When the browser sends requests out, 
the monitoring program collects the URLs for the web pages 
requested by the browser and sends the URLs to a server. 
However, this method requires designing and installing 
monitoring programs that are capable of communicating 
with the different browsers. At the current time, web
browsers are manufactured by a variety of vendors,
including Netscape®, Microsoft®, Sun Microsystems, IBM,
and others. Designing such a monitoring program requires
the programmer to know the details of a proprietary web
browser, and the monitoring program may need to be updated
whenever that proprietary web browser is updated. In
addition, a monitoring program designed for a web browser
manufactured by one vendor is typically not portable to a
web browser manufactured by another vendor, because browser
interface mechanisms are proprietary. Moreover, users may
perceive it as intrusive to be required to install a
specialized application capable of collecting and reporting
information about the web pages retrieved from all other
web sites.

Therefore, there is a need for an improved method to
provide more dependable web page tracking.

There is another need for an improved method to provide
web page tracking without requiring knowledge of the
details about the web navigation software.

There is yet another need for an improved method to
design web page tracking software that is portable to
different software environments.

The present invention meets these needs.

SUMMARY OF THE INVENTION

In one aspect, the invention provides a method for tracking
interactions with pages that have been loaded from a web
server to a terminal, and for storing information about the
interactions to a page tracking server. The method comprises
the steps of:

(a) loading a first page from the web server, the first page
being associated with a page locator for indicating a
location of the first page in the web server, and the first
page containing location information for indicating a
location of a program;

(b) loading the program from the web server based on the
location information, and executing the program;

(c) the program monitoring interactions with the page; and

(d) the program sending information about the interactions
to the page tracking server.
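
The four steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and function names (`PageTrackingServer`, `Page`, `TrackingProgram`, `load_page`) and the dictionary standing in for the web server's program store are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Type


@dataclass
class PageTrackingServer:
    """Stand-in for the page tracking server: it only stores reports."""
    reports: List[Tuple[str, str]] = field(default_factory=list)

    def store(self, page_locator: str, interaction: str) -> None:
        self.reports.append((page_locator, interaction))


@dataclass
class Page:
    locator: str           # page locator, e.g. the page's URL
    program_location: str  # location information for the program


class TrackingProgram:
    """Hypothetical program referenced by a page."""

    def __init__(self, page: Page, server: PageTrackingServer) -> None:
        self.page = page
        self.server = server

    def on_interaction(self, interaction: str) -> None:
        # Steps (c) and (d): observe an interaction and send it,
        # together with the page locator, to the tracking server.
        self.server.store(self.page.locator, interaction)


def load_page(page: Page,
              web_server_programs: Dict[str, Type[TrackingProgram]],
              server: PageTrackingServer) -> TrackingProgram:
    # Step (a) is assumed to have produced `page`.
    # Step (b): fetch the program named by the page's location
    # information and start executing it.
    program_cls = web_server_programs[page.program_location]
    return program_cls(page, server)
```

Because the report is sent by the program running with the page, it does not depend on whether the page itself came from the web server or from a local cache.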

In another aspect, the invention provides a method for
tracking interactions with pages that have been loaded from
a web server to a terminal, and for storing information about
the interactions to a page tracking server. The method
comprises the steps of:

(a) loading a page from the web server, the page being
associated with a page locator for indicating a location
of the page in the web server, and the page containing
a program;

(b) executing the program;

(c) the program monitoring interactions with the page; and

(d) the program sending information about the interactions
to the page tracking server.

The present invention also provides corresponding systems
for the respective aspects mentioned above.

BRIEF DESCRIPTION OF THE DRAWINGS

The purpose and advantages of the present invention will
be apparent to those skilled in the art from the following
detailed description in conjunction with the appended
drawings, in which:

FIG. 1 shows a system that includes N terminals, a network,
and a web site, in accordance with the present invention;
FIG. 2 shows a situation where each of the N terminals
has downloaded its respective Master Applets, DTS Applets,
and SessionID Applet, in accordance with the present
invention;

FIG. 3 shows the process of the (consumer) Master Applet,
DTS Applets, and SessionID Applet being downloaded into
a terminal, in accordance with the present invention;

FIG. 4 shows the process of the (consumer) Master Applet,
DTS Applets, and SessionID Applet being invoked, in
response to loading a subsequent web page, to perform the
operations in accordance with the present invention, when
these Applets have been previously downloaded and cached
in a terminal;

FIG. 5 shows the process of the (consumer) Master
Applet, DTS Applets, and SessionID Applet being invoked,
in response to loading a subsequent web page, to perform the
operations in accordance with the present invention, when
both these Applets and the web page have been previously
downloaded and cached in a terminal;

FIG. 6 shows a session table in greater detail, in
accordance with the present invention;

FIG. 7 shows how an agent (or supervisor) can create a
session interface by downloading an agent page (or a
supervisor page) from administration page repository 149, in
accordance with the present invention;

FIG. 8A shows an agent session interface, in accordance
with the present invention;

FIG. 8B shows a browser supervisor session interface, in
accordance with the present invention;

FIG. 8C shows a supervisor agent session interface, in
accordance with the present invention;

FIG. 9 shows a flowchart illustrating the operation of
joining a session by an agent, in accordance with the present
invention;

FIG. 10 shows a screen display containing two browser
instances, in accordance with the present invention;

FIG. 11 shows a flowchart illustrating the operation of
web page synchronization, in accordance with the present
invention;

FIG. 12A shows a web page containing five data fields, in
accordance with the present invention. FIG. 12B shows a
web page that is similar to that of FIG. 12A, except that the
data in one of the five data fields is changed, in accordance
with the present invention;

FIG. 13 shows a flowchart illustrating the operation of
data synchronization, in accordance with the present
invention;

FIG. 14 shows a flowchart illustrating the operation of
web page tracking, in accordance with the present invention;

FIG. 15 shows a flowchart illustrating the operation of
data tracking, in accordance with the present invention;

FIG. 16 shows a flowchart illustrating the operation of
joining a session by a supervisor, in accordance with the
present invention;

FIG. 17 shows three browser instances for a
supervisor, in accordance with the present invention;

FIG. 18 shows a flowchart illustrating the operation of
re-browsing a web page previously viewed in a session, in
accordance with the present invention; and

FIG. 19 shows a flowchart illustrating the operation of
re-browsing all web pages previously viewed in a session, in
accordance with the present invention.

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENTS

The following description is presented to enable any
person skilled in the art to make and use the invention, and
is provided in the context of a particular application and its
requirements. Various modifications to the preferred
embodiment(s) will be readily apparent to those skilled in
the art, and the principles defined herein may be applied to
other embodiments and applications without departing from
the spirit and scope of the invention. Thus, the present
invention is not intended to be limited to the embodiment(s)
shown, but is to be accorded the broadest scope
consistent with the principles and features disclosed herein.

Referring to FIG. 1, there is shown an exemplary web
page synchronization system 100, in accordance with the
present invention.

As shown in FIG. 1, the system includes N terminals
(104A, . . . , 104K, . . . , and 104N), a network 129 (the
Internet, or a combination of the Internet and an Intranet),
and a web site 134. Each of the terminals has a telephone set
(102A, . . . , 102K, . . . , or 102N) installed in its vicinity.
Each of the terminals can be a PC computer, a workstation,
a Java station, or even a web TV system.

Web site 134 includes a WTS (Web Tracking and
Synching) gateway 142, a WTS server 144 containing a
session table 145 and a user class table 147, a database
processing application 148, an HTTP (Hyper Text Transfer
Protocol) server 152, and a hard disk unit 154 for storing
consumer page repository 146, administration page
repository 149, and database 156. All the components in web site
134 can be installed in one or more computer systems. Each
of the computer systems includes a processing unit (which
may include a plurality of processors), a memory device,
and a disk unit (which may include a plurality of disk sets).

Each of the terminals (104A, . . . , 104K, . . . , or 104N)
includes a processor unit (not shown) and a memory area
(115A, . . . , 115K, . . . , 115N), and runs a Java enabled web
browser (114A, . . . , 114K, . . . , or 114N). Each of the
memory areas (115A, . . . , 115K, . . . , or 115N) is maintained
by the respective browser (114A, . . . , 114K, . . . , or 114N).

Via network 129, each of the browsers (114A, . . . ,
114K, . . . , or 114N) is able to send requests to and receive
web pages from HTTP server 152, and to display the web
pages received at its respective terminal. Each of the
browsers (114A, . . . , 114K, . . . , or 114N) is able to run a Master
Applet (124A, . . . , 124K, . . . , or 124N), a set of DTS (Data
Tracking and Synching) Applets, a SessionID Applet, and an
Agent Applet. As shown in FIG. 1, these Applets are stored
in consumer page repository 146 and can be downloaded
from consumer page repository 146 and stored in the
memory areas of the terminals (104A, . . . , 104K, . . . , or
104N).

Referring to FIG. 2, there is shown the situation where
each of the terminals (104A, . . . , 104K, . . . , or 104N) has
downloaded its respective Master Applets (124A, . . . ,
124K, . . . , or 124N), DTS Applets (126A, . . . , 126K, . . . ,
or 126N), and SessionID Applet (128A, . . . , 128K, . . . ,
128N), in accordance with the present invention.

In FIG. 2, each of the (consumer) Master Applets
(124A, . . . , 124K, . . . , or 124N) is primarily responsible
for: (1) in response to loading each web page at its respective
browser, opening a dedicated socket, and establishing a
socket connection to WTS gateway 142 via network 129 for
its respective browser (114A, . . . , 114K, . . . , 114N), (2)
communicating with WTS server 144 via the socket
connection, from which WTS server 144 is able to identify the
origin (i.e. which browser, which web page, etc.) of the
commands and information that are being delivered through,
(3) monitoring the activities of its respective browser, (4)
sending the information about its respective browser's
activities to WTS server 144, (5) receiving and processing
the information about other browsers' activities, (6) via the
socket connection, providing a single communication path
to WTS server 144 for DTS Applets (126A, . . . , 126K, . . . ,
or 126N), SessionID Applets (128A, . . . , 128K, . . . , or
128N), or any other consumer Applets embedded on the
same page with the Master Applet, (7) sending commands to
WTS server 144 to request services, for itself and for DTS
Applets (126A, . . . , 126K, . . . , or 126N), SessionID Applets
(128A, . . . , 128K, . . . , or 128N), or any other consumer
Applets embedded on the same page with the Master Applet,
and (8) sending user class information together with the
commands, to indicate that its respective browser is a
consumer user.
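
As a rough illustration of responsibilities (6) through (8), the following Python sketch shows one object acting as the single communication path for the other Applets on a page, tagging every message with the browser, the page, and the user class. The names `FakeConnection`, `MasterApplet`, `forward`, and the command strings (`PAGE_LOADED`, `DATA_UPDATED`, `PAGE_UNLOADED`) are hypothetical; the in-memory connection merely stands in for the socket connection to WTS gateway 142.

```python
from typing import Any, Dict, List


class FakeConnection:
    """In-memory stand-in for the socket connection to the gateway."""

    def __init__(self) -> None:
        self.sent: List[Dict[str, Any]] = []
        self.open = True

    def send(self, message: Dict[str, Any]) -> None:
        if not self.open:
            raise RuntimeError("connection is closed")
        self.sent.append(message)

    def close(self) -> None:
        self.open = False


class MasterApplet:
    """Hypothetical sketch; one instance per loaded page."""

    def __init__(self, browser_id: str, page_url: str,
                 connection: FakeConnection,
                 user_class: str = "consumer") -> None:
        self.browser_id = browser_id
        self.page_url = page_url
        self.connection = connection
        self.user_class = user_class

    def _send(self, source: str, command: str, payload: Any = None) -> None:
        # Every message carries its origin and the user class.
        self.connection.send({
            "browser_id": self.browser_id,
            "page_url": self.page_url,
            "user_class": self.user_class,
            "source": source,
            "command": command,
            "payload": payload,
        })

    def start(self) -> None:
        self._send("MasterApplet", "PAGE_LOADED")

    def forward(self, applet_name: str, command: str,
                payload: Any = None) -> None:
        # Other Applets on the page share this one connection.
        self._send(applet_name, command, payload)

    def stop(self) -> None:
        self._send("MasterApplet", "PAGE_UNLOADED")
        self.connection.close()
```

Routing every Applet's traffic through one object keeps a single connection per page and gives the server one consistent place to read the origin of each command.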

Each set of DTS Applets (126A, . . . , 126K, . . . , or 126N)
contains one or more individual DTS Applets, which are
primarily responsible for (1) displaying and monitoring the
data activities (data inputs or data updates of data fields) on
web pages that are being displayed by its respective browser,
(2) sending the data activities to WTS server 144 via its
respective Master Applet, (3) receiving the data activities
from other browsers via its respective Master Applet, and (4)
processing the data activities from other browsers for the
web pages that are being displayed by its respective browser.

Each of the SessionID Applets (128A, . . . , 128K, . . . ,
or 128N) is responsible for retrieving, and for displaying on
a web page, the current SessionID.

As shown in administration page repository 149, Agent
Applet (or Supervisor Applet) is responsible for creating a
session interface, and for joining, monitoring, and controlling
a session through the session interface. The (administration)
Master Applet is primarily responsible for: (1) opening a
dedicated socket, and establishing a socket connection to
WTS gateway 142 via network 129 for the session interface
created by Agent Applet, Supervisor Applet, or any other
administration Applets embedded on the same web page
with the Master Applet, (2) communicating with WTS server
144 via the socket connection, from which WTS server 144
is able to identify the origin (i.e. from which session interface)
of the commands and information that are being delivered
through, (3) via the socket connection, providing a single
communication path to WTS server 144 for Agent Applet,
Supervisor Applet, or any other administration Applets
embedded on the same web page with the Master Applet,
and (4) sending user class information together with the
commands, to indicate that its respective browser is an
administration user.

WTS gateway 142 is responsible for maintaining all
socket connections between Master Applets and WTS server
144. The connections between Master Applets and WTS
gateway 142 take place using standard sockets. The
connection between WTS gateway 142 and WTS server 144
takes place using RMI (Remote Method Invocation).
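
The gateway's bookkeeping role can be sketched in Python as follows. This is an assumption-laden illustration, not the described implementation: the class name `WTSGatewaySketch` and its methods are hypothetical, connections are keyed by (browser ID, page URL), and a plain callback stands in for the RMI call into WTS server 144.

```python
from typing import Callable, Dict, List, Tuple

ConnectionKey = Tuple[str, str]  # (browser ID, page URL)


class WTSGatewaySketch:
    def __init__(self,
                 server_handler: Callable[[ConnectionKey, str], None]) -> None:
        # `server_handler` stands in for the remote call to the server.
        self._server_handler = server_handler
        self._connections: Dict[ConnectionKey, List[str]] = {}

    def connect(self, browser_id: str, page_url: str) -> ConnectionKey:
        # One dedicated connection per browser and page.
        key = (browser_id, page_url)
        self._connections[key] = []
        return key

    def receive(self, key: ConnectionKey, command: str) -> None:
        if key not in self._connections:
            raise KeyError(f"no open connection for {key}")
        self._connections[key].append(command)
        # Relay the command, with its origin, to the server.
        self._server_handler(key, command)

    def disconnect(self, key: ConnectionKey) -> None:
        self._connections.pop(key, None)

    def active_connections(self) -> List[ConnectionKey]:
        return sorted(self._connections)
```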

WTS server 144 is responsible for: (1) managing and
tracking the activities of all browsers participating in active
sessions, exemplary activities including: loading of,
interacting with, and unloading of web pages, (2) recording the
information about the activities, (3) managing the
synchronization of the activities for all browsers participating in the
active sessions, (4) creating a session when a consumer user
(via a browser) sends a request to web site 134 for the first
time, (5) defining the session length intervals, (6) purging
sessions that have been inactive for more than the specified
session length intervals, (7) adding participants to and
deleting participants from a session, and (8) servicing all
commands from consumer Applets (such as the (consumer)
Master Applet, DTS Applets, and SessionID Applets) and
from administration Applets (such as the (administration)
Master Applet, Agent Applets, and Supervisor Applet).
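
Responsibilities (4) through (6) amount to tracking the last activity of each session and discarding sessions that exceed the session length interval. A minimal Python sketch follows; the class `SessionRegistry`, its method names, and the use of plain numbers for time are assumptions for illustration only.

```python
from typing import Dict, List


class SessionRegistry:
    def __init__(self, session_length: float) -> None:
        # Maximum allowed idle time before a session is purged.
        self.session_length = session_length
        self._last_activity: Dict[str, float] = {}

    def create(self, session_id: str, now: float) -> None:
        self._last_activity[session_id] = now

    def touch(self, session_id: str, now: float) -> None:
        # Record activity for a session that still exists.
        if session_id in self._last_activity:
            self._last_activity[session_id] = now

    def purge_inactive(self, now: float) -> List[str]:
        # Collect first, then delete, to avoid mutating while iterating.
        expired = [sid for sid, last in self._last_activity.items()
                   if now - last > self.session_length]
        for sid in expired:
            del self._last_activity[sid]
        return expired

    def active(self) -> List[str]:
        return sorted(self._last_activity)
```

Passing `now` explicitly, rather than reading a clock inside the class, keeps the sketch deterministic and easy to test.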

Consumer page repository 146 stores the web pages and 
Applets for consumers. Consumer Applets can be selectively 
embedded into consumer web pages. Exemplary consumer 
Applets include: (consumer) Master Applet, DTS Applets, 
SessionID Applet, etc. 

Administration page repository 149 stores the web pages 
and Applets for call center administration users, including: 
administrator, supervisor, agent, etc. Administration Applets 
can be selectively embedded into administration web pages. 
Exemplary administration Applets include: (administration) 
Master Applet, Agent Applet, Supervisor Applet, etc. 

To better describe the present invention, the Applets 
stored in (or downloaded from) repository 146 can be 
referred to as consumer Applets, and the Applets stored in 
(or downloaded from) repository 149 can be referred to as 
administration Applets. For example, the Master Applet 
stored in (or downloaded from) repository 146 can be 
referred to as consumer Master Applet, and the Master
Applet stored in (or downloaded from) repository 149 can be
referred to as administration Master Applet.

HTTP server 152 contains a security application that allows
consumer users to get access only to the web pages stored in
consumer page repository 146, and allows administration users
(such as administrator, supervisor, agent, etc.) to get access
to the web pages stored in both consumer page repository 146
and administration page repository 149.

Session table 145 is responsible for maintaining the 
information for all active sessions. 

User class table 147 is responsible for keeping records of
the user classes assigned to different users. Exemplary user
classes are: administrator, supervisor, agent, and consumer.

Based on user classes (administrator, supervisor, agent, 
and consumer), WTS server 144 provides the following 
services: 

(1) creating a session (consumer); 

(2) storing data received from a session participant 
(supervisor, agent, and consumer); 

(3) listing active sessions (administrator and supervisor); 

(4) listing the information associated with active sessions 
(administrator, and supervisor); 

(5) listing current users (administrator); 

(6) joining a session (supervisor and agent); 

(7) terminating a session (supervisor); 

(8) monitoring a session (supervisor and agent); 

(9) configuring session parameters (administrator); and

(10) sending commands and information to a consumer 
Master Applet or an administration Master Applet in a 
participating browser (supervisor, agent, and 
consumer). 
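
The service list above maps naturally onto a lookup table from service to permitted user classes. The Python sketch below encodes exactly the ten entries listed; the service identifiers and the helper names `is_allowed` and `services_for` are hypothetical labels chosen for illustration.

```python
from typing import Dict, FrozenSet, List

# Service name -> user classes permitted to use it, per items (1)-(10).
SERVICE_CLASSES: Dict[str, FrozenSet[str]] = {
    "create_session": frozenset({"consumer"}),
    "store_data": frozenset({"supervisor", "agent", "consumer"}),
    "list_active_sessions": frozenset({"administrator", "supervisor"}),
    "list_session_info": frozenset({"administrator", "supervisor"}),
    "list_current_users": frozenset({"administrator"}),
    "join_session": frozenset({"supervisor", "agent"}),
    "terminate_session": frozenset({"supervisor"}),
    "monitor_session": frozenset({"supervisor", "agent"}),
    "configure_session_parameters": frozenset({"administrator"}),
    "send_to_master_applet": frozenset({"supervisor", "agent", "consumer"}),
}


def is_allowed(user_class: str, service: str) -> bool:
    # Unknown services are denied by default.
    return user_class in SERVICE_CLASSES.get(service, frozenset())


def services_for(user_class: str) -> List[str]:
    return sorted(service for service, classes in SERVICE_CLASSES.items()
                  if user_class in classes)
```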

Database 156 is responsible for storing data collected in 
session table 145. 

HTTP server 152 is responsible for processing the 
requests issued by one of the web browsers, retrieving the 
web pages from either consumer page repository 146 or 
administration page repository 149, and sending the web 
pages to the browsers that have generated these requests. 

Database processing application 148 is responsible for 
writing the data collected in session table 145 into database 
156. 
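
One way to picture this write-out step is the Python sketch below, which persists URL history records into a relational table using the standard `sqlite3` module. The text does not name a database product or schema; the table name `url_history`, its columns, and the function `write_url_history` are assumptions for illustration.

```python
import sqlite3
from typing import Iterable, Tuple

# (session ID, page URL, participant ID, loading time, unloading time)
UrlHistoryRow = Tuple[str, str, str, float, float]


def write_url_history(conn: sqlite3.Connection,
                      rows: Iterable[UrlHistoryRow]) -> int:
    """Append rows to the table and return the total row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS url_history ("
        "session_id TEXT, page_url TEXT, participant_id TEXT, "
        "loading_time REAL, unloading_time REAL)"
    )
    # Parameter binding avoids building SQL from record contents.
    conn.executemany(
        "INSERT INTO url_history VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM url_history").fetchone()[0]
```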

Referring to FIG. 3, there is shown the process of the
(consumer) Master Applet, DTS Applets, and SessionID
Applet being downloaded into terminal 104A from HTTP
server 152 in response to loading an initial web page, and
then being invoked to perform the operations in accordance
with the present invention.

As shown in FIG. 3, a (consumer) Master Applet, a set of
DTS Applets, and a SessionID Applet are embedded into
web page 204 by using a set of applet tags 208. Web page
204 is associated with a specific URL indicating the location
of web page 204 in HTTP server 152.

As indicated by dotted line (1), web browser 114A sends
a request including the URL of web page 204 to HTTP
server 152 via network 129. As indicated by dotted line (2),
in response to the request, HTTP server 152 retrieves web
page 204 from consumer page repository 146 and sends it to
web browser 114A via network 129. Web page 204 contains
a set of applet tags 208, which indicate the location of
Master Applet, DTS Applets, and SessionID Applet in HTTP
server 152. As indicated by dotted line (3), web browser
114A loads web page 204. As indicated by dotted line (4),
since Master Applet, DTS Applets, and SessionID Applet
have not been downloaded, web browser 114A sends
requests via network 129 to download these Applets based
on applet tags 208. As indicated by dotted line (5), HTTP
server 152 sends Master Applet, DTS Applets, and
SessionID Applet to browser 114A via network 129. As
indicated by dotted line (6), browser 114A stores Master Applet
124A, DTS Applets 126A, and SessionID Applet 128A into
memory area 115A of terminal 104A, and initializes and
invokes these Applets. After being invoked, these Applets
run together with web browser 114A, to monitor and
process the activities for which they are responsible. As
indicated by line (7), Master Applet 124A
opens a dedicated socket and establishes a socket connection
to WTS gateway 142 for browser 114A and web page 204.
Via the socket connection, Master Applet 124A sends WTS
server 144 a command, together with an ID unique to
browser 114A. In response to the command from Master
Applet 124A, WTS server 144 creates a session for browser
114A based on the unique ID, issues a time stamp
(loading time) indicating the time at which the command
was received, and stores the URL and time stamp of web
page 204 into the session created for browser 114A. As will
be seen in the description in connection with FIG. 6, following,
the URL, command, and loading time are stored in a URL
history list and a command list created for the session.

Referring to FIG. 4, there is shown the process of the
(consumer) Master Applet 124A, DTS Applets 126A, and
SessionID Applet 128A being invoked, in response to
loading a subsequent web page 214 (subsequent to web page
204), to perform the operations in accordance with the
present invention, when these Applets have been previously
downloaded and cached in terminal 104A.

As indicated by dotted line (1), to download web page
214, web browser 114A sends a request including the URL
of web page 214 to HTTP server 152 via network 129.
Before loading web page 214, the following events occur:
(a) browser 114A instructs Master Applet 124A to run a stop
routine, (b) via the socket connection established for
browser 114A and web page 204, Master Applet 124A sends
a command to inform WTS server 144 that web page 204 has
been unloaded, and disconnects the socket connection
established for browser 114A and web page 204, (c) WTS server
144 issues a time stamp (unloading time) indicating the time
the command was received, and (d) WTS server 144 records
the URL and the time stamp of web page 204 into the session
created for browser 114A. As will be seen in the description in
connection with FIG. 6, following, the URL, command, and
unloading time are stored in a URL history list and a command
list created for the session. As indicated by dotted line (2),
HTTP server 152 retrieves web page 214 from consumer
page repository 146 and sends it to browser 114A. Like web
page 204, web page 214 contains a set of applet tags 208 for
indicating the location of Master Applet 124A, DTS Applets
126A, and SessionID Applet 128A. As indicated by dotted
line (3), web browser 114A loads web page 214. As
indicated by dotted line (4), in response to the loading of web
page 214, web browser 114A locates Master Applet 124A,
DTS Applets 126A, and SessionID Applet 128A (based on
the indication of applet tags 208) that are cached by browser
114A in memory area 115A, and initializes these Applets and
then invokes them. As indicated by line (5), Master Applet
124A opens a dedicated socket and establishes a socket
connection to WTS gateway 142 for browser 114A and web
page 214. Via the socket connection established for browser
114A and web page 214, Master Applet 124A sends a
command, together with the ID unique to browser 114A and
the URL of web page 214, to inform WTS server 144 that
web page 214 has been loaded. WTS server 144 issues a
time stamp (loading time) indicating the time the command
was received and stores the URL and time stamp of web
page 214 into the session created for browser 114A. As will be
seen in the description in connection with FIG. 6, following,
the URL, command, and loading time are stored in a URL
history list and a command list created for the session.

Referring to FIG. 5, there is shown the process of the
(consumer) Master Applet 124A, DTS Applets 126A, and
SessionID Applet 128A being invoked, in response to
loading a subsequent web page 224 (subsequent to web page
214), to perform the operations in accordance with the
present invention, when both these Applets and web page
224 have been previously downloaded and cached by
browser 114A in terminal 104A.

As indicated by dotted line (1), web browser 114A loads
web page 224 cached in memory area 115A maintained by
browser 114A. Like web pages 204 and 214, web page 224
contains a set of applet tags 208 indicating the location of
Master Applet 124A, DTS Applets 126A, and SessionID
Applet 128A. Before loading web page 224, the following
events occur: (a) browser 114A instructs Master Applet
124A to run a stop routine, (b) via the socket connection
established for browser 114A and web page 214, Master
Applet 124A sends a command to inform WTS server 144
that web page 214 has been unloaded, and disconnects the
socket connection established for browser 114A and web
page 214, (c) WTS server 144 issues a time stamp
(unloading time) indicating the time the command was
received, and (d) WTS server 144 records the URL and time
stamp of web page 214 into the session created for browser
114A. As will be seen in the description in connection with
FIG. 6, following, the URL, command, and unloading time
are stored in a URL history list and a command list created
for the session. As indicated by dotted line (2), in response
to the loading of web page 224, browser 114A locates
Master Applet 124A, DTS Applets 126A, and SessionID Applet
128A that have been cached by browser 114A in memory
area 115A in terminal 104A, and initializes and invokes
these Applets. As indicated by line (3), Master Applet 124A
opens a dedicated socket and establishes a socket connection
to WTS gateway 142 for browser 114A and web page 224.
Via the socket connection established for browser 114A and
web page 224, Master Applet 124A sends a command,
together with the ID unique to browser 114A and the URL
of web page 224, to inform WTS server 144 that web page
224 has been loaded. WTS server 144 issues a time stamp
(loading time) indicating the time the command was
received and stores the URL and time stamp into the session
created for browser 114A. As will be seen in the description
in connection with FIG. 6, following, the URL, command,
and loading time are stored in a URL history list and a
command list created for the session.

In the example shown in FIG. 5, it should be appreciated
that even though no request arrives at HTTP server 152
when web page 224 is loaded from cached memory in
terminal 104A, Master Applet 124A still sends browsing
activities to WTS server 144.

It should be noted that the processes shown in FIGS. 3-5
of loading and invoking Master Applet, DTS Applets, and
SessionID Applet for terminal 104A can also be used for
terminals 104K, . . . , 104N.

In FIGS. 3-5, Master Applet, DTS Applets, and
SessionID Applet are all embedded into web pages 204, 214,
and 224. However, it should be noted that not all the Applets
are required to be embedded into a web page. Depending on
the desired functions to be performed, respective Applets
can be selectively embedded into a web page by selectively
setting applet tags in the web page. For example, if data
synchronization and tracking of individual elements are not
needed, the applet tags for linking DTS Applets can be
eliminated from the web page. By the same token, if
additional functions are needed, additional applet tags can
be added into the web page to link additional Applets.

Referring to FIG. 6, there is shown session table 145 (see
FIG. 1) in greater detail, in accordance with the present
invention.

While browsers at their respective terminals are browsing
through the web pages in web site 134, WTS server 144
collects and analyzes the information about the interactions
between all browsers and the web pages that have been
downloaded to the browsers from web site 134. One
difficulty in collecting and analyzing such information is that
browsing individual web pages in web site 134 is a stateless
process. More specifically, web site 134 receives a sequence
of requests from different browsers, and sends the respective
web pages to the respective browsers in response to the
sequence of requests. Since in processing the requests from an
individual browser, web site 134 does maintain a constant

ID is associated with: (1) a session list for maintaining
information about a session, (2) a participant list for
maintaining information about all participant browsers in a
session (note: when a session is first created, it only contains
one participant), (3) a URL history list for maintaining
information about all web pages visited by all participants in
a session, (4) a data list for maintaining information about
the data fields on web pages visited by all participants in
a session, and (5) a command list for maintaining
information about all commands issued to WTS server 144 by the
participants.

Typical items in a session list are: (1) SessionID for
identifying a session, (2) UserName for indicating the actual
name for whom the session is created, (3) StartTime for
indicating the time of starting the session, (4) StopTime for
indicating the time of stopping the session, and (5)
SessionNotes for recording the notes of the session.

Typical fields contained in a participant list are: (1)
SessionID for linking the participant list to a session, (2)
ParticipantID for identifying a participant, (3)
ParticipantAddresses for indicating a participant's IP address, (4)
Class for indicating the user class of the participant
(customer, agent, supervisor, administrator, etc.), and (5)
Direction for indicating the synchronization direction for the
participant browser.

Typical fields contained in a URL history list are: (1)
SessionID for linking the URL history list to a session, (2)
PageURL for indicating the URL of a web page visited, (3)
ParticipantID for identifying a participant who visited the
web page, (4) LoadingTime for indicating the loading time
of the web page, and (5) UnloadingTime for indicating the
unloading time of the web page.

Typical fields contained in a data list are: (1) SessionID
for linking the data list to a session, (2) WasRelayed for
indicating if this data field has been broadcast, (3)
FieldName for indicating the actual name of the data field, (4)
DataName for indicating the name of the data field displayed
on a web page, (5) DataValue for indicating the value of the
data field, (6) TimeStamp for indicating the time at which
this data field is updated, (7) URL for indicating the web
page on which the data field was displayed, and (8)
ParticipantID for indicating the participant browser that updated
this data field.

connection to the same browser to keep an one-to-one Typical fields contained in a command list are: (1) Ses- 

relationship, web site 134 has no control over, or maintain sionID for linking the data list to a session, (2) Command for 

data on, the sequences of the requests from the browsers. 45 indicating the specific command executed (loading a page, 

To meaningfully collect and analyze the information unloading a page, changing a data field, etc.), (3) URL for 

about the interactions between the browsers and web pages, indicating the web page to which the command operated, (4) 

a session is defined as a collection of web page interactions FieldPoint for indicating the data field to which the com- 

that occur over a given period of time from a specific mand operated, and (5) TimeStamp for indicating the time at 

browser. A session is created when a browser first hits web 50 which command was executed. 

site 134, and a session window (or session length interval) Before a session is purged from session table 145, data- 
is defined for the session. If activities from a specific base processing application 147 stores the associated session 
browser (identified by an ID unique to the browser, issued by list, URL history list, and command list to database 156. The 
a respective Master Applet) does not occur within the data contained in these three lists can be late used by data 
session window, the session is terminated and cleaned up by 55 warehouse integration applications. 
WTS server 144. A session window is refreshed (reset to Referring to FIG. 7, there is shown an operation for 
time zero) each time the information about the associated creating a session interface for an agent (or a supervisor) by 
browser is sent to WTS server 144. For example, if a session downloading an agent page (or a supervisor page) from 
window is defined as 15 minutes, as long as the associated administration page repository 149, in accordance with the 
terminal has some activity every 15 minutes, the session will 60 present invention. In the example shown in FIG. 7, it is 
remain open. After 15 minutes of inactivity, the session is assumed that administration user class (either agent user 
terminated and purged. A subsequent request from the same class or supervisor user class ) is assigned to terminal 104N, 
terminal will cause a new session to be created. After a so that the security application in HTTP server 152 grants 
session has been created for a terminal, one or more other the access to the web page stored in both consumer page 
terminals can join the session. 65 repository 146 and administration page repository 149. 
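
The session window behavior described for session table 145 can be illustrated with a short sketch. It is a minimal illustration under stated assumptions, not the implementation of WTS server 144: the names `SessionTable`, `touch`, and `purge_expired`, the in-memory dictionary, and the explicit `now` argument (seconds) are all hypothetical. Only the behavior comes from the text: a session is created on first activity, the window is reset on every report, and an idle session is purged so that a later request starts a new session.

```python
import itertools

SESSION_WINDOW_SECONDS = 15 * 60  # the 15-minute example used in the text


class SessionTable:
    """Hypothetical in-memory stand-in for session table 145."""

    def __init__(self, window=SESSION_WINDOW_SECONDS):
        self.window = window
        self._next_id = itertools.count(1)
        # browser ID -> [session ID, time of last reported activity]
        self._by_browser = {}

    def purge_expired(self, now):
        """Remove sessions idle for longer than the window; return their browser IDs."""
        # Assumption: activity arriving exactly at the window boundary keeps
        # the session open, so only strictly longer idle periods expire.
        expired = [b for b, (_, last) in self._by_browser.items()
                   if now - last > self.window]
        for browser_id in expired:
            del self._by_browser[browser_id]
        return expired

    def touch(self, browser_id, now):
        """Record activity from a browser and return its session ID."""
        self.purge_expired(now)
        entry = self._by_browser.get(browser_id)
        if entry is None:
            # First hit, or first request after a purge: create a new session.
            entry = [next(self._next_id), now]
            self._by_browser[browser_id] = entry
        else:
            # Refresh the session window (reset to time zero).
            entry[1] = now
        return entry[0]
```

Passing `now` explicitly keeps the sketch deterministic; a real server would more likely read a clock and run the purge on a timer, and would store the lists to the database before purging.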

As shown in FIG. 6, session table 145 includes M Session
IDs created for M sessions respectively. Each of the session

At step 702, an agent at terminal 104N types in an agent
URL, and browser 114N sends the URL to
HTTP server 152 to retrieve an agent page, in which an
(administration) Master Applet and an Agent Applet are
embedded. For a supervisor, he/she types in a supervisor
URL at terminal 104N, and browser 114N sends the URL to
HTTP server 152 to retrieve a supervisor page, in which an
(administration) Master Applet and a Supervisor Applet are
embedded.

At step 704, HTTP server 152 retrieves the agent page (or
a supervisor page) from administration page repository 149
and sends it to browser 114N.

At step 706, browser 114N downloads the agent page, in
which a Master Applet (administration Master Applet) and
an Agent Applet are embedded; or downloads the supervisor
page, in which a Master Applet (administration Master
Applet) and a Supervisor Applet are embedded.

At step 708, browser 114N downloads the Master Applet
and Agent Applet from HTTP server 152, initializes and
invokes these Applets; or downloads the Master Applet and
Supervisor Applet from HTTP server 152, initializes and
invokes these Applets.

At step 710, Master Applet opens a dedicated socket,
establishes a socket connection to WTS gateway 142, and
sends an ID unique to browser 114N to WTS server 144.
WTS server 144 is able to identify browser 114N based on
the unique ID.

At step 712, Agent Applet creates an agent session interface
800A shown in FIG. 8A for the agent user; or Supervisor
Applet creates a supervisor session interface 800B
shown in FIG. 8B for the supervisor agent.

Referring to FIG. 8A, there is shown an agent session
interface 800A created for an agent at step 712, in accordance
with the present invention.

As shown in FIG. 8A, the session interface contains a text
box 804 for entering a session ID, a join session button 806
for joining a session identified by the session ID, a drop
button 808 for leaving a session, a leader check box 810
(selecting of which designates a browser as a leading
browser in synchronization), a follower check box 812
(selecting of which designates a browser as a following
browser in synchronization), a scrollable list box 816 for
displaying the information contained in the participant list
associated with a selected session, a scrollable list box 818
for displaying the information in an identified URL history
list, and a text box 820 for displaying the information in an
identified data list. If both the leader and follower check
boxes 810 and 812 are selected in the agent session
interface, browser 114A acts as both leading and following
browser in synchronization.

Referring to FIG. 8B, there is shown a browser supervisor
session interface 800B created for a supervisor at step 712,
in accordance with the current invention.

As shown in FIG. 8B, the session interface contains a
scrollable list box 832 for displaying session IDs of all active
sessions in session table 145 and for selecting one of the
session IDs, a text box 834 for displaying relevant statistics
of WTS server 144, a multi column scrollable list box 836
for displaying details about the session selected in scrollable
list box 832, and a select session button 838 for selecting a
session from scrollable list box 832. By using the information
in scrollable list box 832, a supervisor agent can
monitor all active sessions. By using the information in
multi column scrollable list box 836, a supervisor can
monitor operational status of a session selected from scrollable
list box 832, including: (1) whether this session is
being helped by an agent, (2) user name, and (3) agent ID.

By selecting select session button 838, a supervisor can
create an agent session interface as shown in FIG. 8C.

Referring to FIG. 8C, there is shown a supervisor agent
session interface 800C, in accordance with the present
invention.

Referring to FIG. 9, there is shown a flowchart illustrating
the operation of joining a session by an agent, in accordance
with the present invention.

In the example shown in FIG. 9, it is assumed that: (1) a
consumer at terminal 104A is browsing web pages from
consumer page repository 146 via browser 114A, (2) session
list 1 shown in FIG. 6 has been created for browser
114A, (3) agent class has been assigned to browser 114N, (4)
agent session interface 800A shown in FIG. 8A has been
displayed on terminal 104N, (5) an (administration) Master
Applet and Agent Applet have been previously downloaded
to browser 114N, (6) a dedicated socket connection has
been established for session interface 800A displayed at
terminal 104N by the (administration) Master Applet, and
(7) the agent at terminal 104N is on duty at a call center.

As shown in FIG. 9, at step 902, the consumer is browsing
a web page at terminal 104A. On the web page, SessionID
Applet 128A displays the current session ID. A call center
telephone number the consumer can call is also displayed on
the web page.

At step 904, the consumer is connected to the call center
by dialing the telephone number via telephone 102A (see
FIG. 1), and the call is directed by the call center to the
agent.

At step 906, the consumer tells, via telephone 102A (see
FIG. 1), the agent the current session ID displayed. It should
be noted that, instead of using the telephone, the agent can
be informed of the current session ID by alternative methods.
For example, the consumer can enter his/her telephone
number into a special web page that contains the caller ID
of the consumer along with the current session ID. This
information can be stored into a special lookup table that can
be used by the agent to look up the current session ID.

At step 908, at terminal 104N, the agent types the current
session ID into text box 804 (see FIG. 8A).

At step 910, in response to a loss of focus or a pressing
of the Enter key, through the socket connection established
for agent session interface 800A displayed on terminal
104N, the (administration) Master Applet at terminal 104N
sends a command to WTS server 144, to retrieve the
information in participant list 1, URL history list 1, and data
list 1 (see FIG. 6) for the Agent Applet.

At step 912, WTS server 144 sends the information
requested to the Agent Applet (via the Master Applet).

At step 914, the Agent Applet at terminal 104N displays
some information from participant list 1 and URL history list
1 in (participant) scrollable list box 816 and (URL history)
scrollable list box 818, respectively.

At step 916, the agent selects join button 806 in agent
session interface 800A displayed on terminal 104N.

At step 918, in response to the selection at step 916,
through the socket connection which has been established
for the agent session interface displayed on terminal 104N, the
(administration) Master Applet sends WTS server 144 a
command to join the selected session. Based on the identification
associated with the socket connection, WTS server 144
is able to generate a ParticipantID for browser 114N and to
find the ParticipantAddress for terminal 104N.

At step 920, WTS server 144 stores the ParticipantID and
ParticipantAddress into participant list 1. At this step, participant
list 1 includes two participant records (two rows)
containing the ParticipantIDs for browsers 114A and 114N
respectively.

At step 922, at terminal 104N, the agent selects: leading
check box 810 or following check box 812, or both of them.
By only selecting leader check box 810, the activities at
terminal 104N are synchronized at terminal 104A, but not
the other way around. By only selecting follower check box
812, the activities at terminal 104A are synchronized at
terminal 104N, but not the other way around. By selecting both
leader and follower check boxes 810 and 812, the activities
at terminals 104A and 104N are synchronized with each
other (bidirectional synchronization). In response to the
selection(s), through the socket connection which has been
established for agent session interface 800A, the
(administration) Master Applet sends WTS server 144 a
command designating the synchronization direction. WTS
server 144 stores the synchronization direction information
into the Direction fields of the two records in participant
list 1. In this example, it is assumed that the bi-directional
synchronization has been selected for terminals 104A and
104N.

At step 924, WTS server 144 sends the (administration)
Master Applet the URL of the web page being currently
browsed at terminal 104A.

At step 926, the Agent Applet at terminal 104N opens a
browser window 1004 (a second browser instance) as shown
in FIG. 10.

At step 928, browser 114N downloads the web page
identified by the URL from consumer page repository 146,
and displays it in browser window 1004. A (consumer)
Master Applet, a set of DTS Applets, and a SessionID Applet
are embedded in the web page downloaded.

At step 930, browser 114N downloads (consumer) Master
Applet 124N, set of DTS Applets 126N, and SessionID
Applet 128N.

At step 932, the web pages displayed in second browser
window 1004 at terminal 104N are being synchronized with
the web pages being displayed at terminal 104A.

After step 932, if the agent (the first agent) at terminal
104N needs assistance from another agent (the second agent)
at terminal 104K, the first agent can call the second agent
and tell him/her the current session ID. The second agent can
then join the current session using an agent session interface
as shown in FIG. 8A displayed at terminal 104K.

Referring to FIG. 10, there is shown a screen display
containing two browser instances (800A and 1004) at terminal
104N, in accordance with the present invention.

As shown in FIG. 10, at terminal 104N, the first browser
instance provides an agent session interface 800A to control
and monitor the current session, and the (administration)
Master Applet for agent session interface 800A establishes
and maintains a socket connection for agent session interface
800A. The second browser instance provides a browser
window 1004 to display the web pages being synchronized.
(Consumer) Master Applet 124N establishes and maintains
a socket connection for each web page displayed in browser
window 1004.

Referring to FIG. 11, there is shown a flowchart illustrating
the operation of web page synchronization, in accordance
with the present invention.

In the example shown in FIG. 11, it is assumed that: (1)
a consumer at terminal 104A is browsing web pages from
consumer page repository 146 via browser 114A, (2) a
session has been created for browser 114A, (3) session list
1 and participant list 1 shown in FIG. 6 have been created
for the session, (4) bi-directional synchronization has been
selected for terminal 104A and all participant terminals, and
(5) the (consumer) Master Applet, DTS Applets, and SessionID
Applet have been downloaded into browser 114A
and all participant browsers.

As shown in FIG. 11, at step 1104, browser 114A loads a
web page either from consumer page repository 146 or from
memory area 115A in terminal 104A. If Master Applet
124A, DTS Applets 126A, and SessionID Applet 128A had
not been downloaded to browser 114A, browser 114A would
download these Applets from consumer page repository 146.
However, in this example, these Applets are assumed to be
downloaded.

At step 1106, in response to the loading of the web page,
browser 114A initializes and invokes Master Applet 124A,
DTS Applets 126A, and SessionID Applet 128A.

At step 1108, Master Applet 124A: (1) opens a dedicated
socket and establishes a socket connection to WTS gateway
142 for browser 114A and the web page loaded, and (2) via
the socket connection, sends WTS server 144 a command
together with an ID unique to browser 114A and the URL of
the web page loaded. Based on the unique ID, WTS server
144 is able to identify the session created for browser 114A.

At step 1110, WTS server 144 identifies the session for
browser 114A.

At step 1112, WTS server 144 locates all IP addresses
assigned to participant terminals in participant list 1 (shown
in FIG. 6), and sends a command, together with the URL, to
all the participant terminals (except that WTS server 144
does not send the URL to terminal 104A, because the URL
originated from terminal 104A).

At step 1114, upon receiving the command, the
(consumer) Master Applets in the participant terminals initialize
themselves and pass the URL to their respective
browsers.

At step 1116, the respective browsers in the participant
terminals download and display the web page according to
the URL.

It should be noted that, like terminal 104A, each of the
participant terminals (at which an agent session interface is
displayed) can lead the page synchronization using the
operation shown in FIG. 11.

Referring to FIG. 12A, there is shown a web page
containing five data fields, specifically: name 1202, time
period 1204, account balance 1206, payment 1208, comments
1210, a text box 1212 for displaying the current
session ID, and a text box for displaying the call center
number the consumer can call, in accordance with the
present invention.

Referring to FIG. 12B, there is shown a web page that is
similar to that of FIG. 12A, except that the data in the field
of name 1202 is changed from Susan King to Sue Grant and
the changes are synchronized at a participant terminal, in
accordance with the present invention.

Referring to FIG. 13, there is shown a flowchart illustrating
the operation of data synchronization, in accordance
with the present invention.

In the example shown in FIG. 13, it is assumed that: (1)
a customer at terminal 104A is browsing web pages via
browser 114A, (2) a session has been created for terminal
104A, (3) session list 1 and participant list 1 shown in FIG.
6 have been created for the session, (4) terminal 104N is one
of the participants, (5) web page 1200 containing five data
fields shown in FIG. 12A is displayed on terminal 104A and
all participant terminals, (6) a bi-directional synchronization
has been selected for terminal 104A and all participant
terminals, (7) the (consumer) Master Applet, DTS Applets,
and SessionID Applet have been downloaded to browser
114A and the browsers at all participant terminals, (8) the
DTS Applets contain five individual Applets: DTS Applet 1,
DTS Applet 2, DTS Applet 3, DTS Applet 4, and DTS
Applet 5, (9) these five individual DTS Applets are respectively
responsible for monitoring and processing the events
that occur on the five data fields of web page 1200 shown in
FIG. 12A, (10) (consumer) Master Applet 124A has established
a dedicated socket connection to WTS gateway 142
for web page 1200 displayed at terminal 104A, and (11) the
customer at terminal 104A wants to change name
field 1202 from Susan King to Sue Grant.

As shown in FIG. 13, at step 1304, the customer changes 
the name in name field 1202 from Susan King to Sue Grant. 

At step 1306, in response to a loss of focus on name field
1202 or pressing the Enter key, DTS Applet 1 detects the
change and passes the change to Master Applet 124A.

At step 1308, via the dedicated socket connection, Master
Applet 124A sends WTS server 144 a command together
with the change of name field 1202. Since this change is 
passed to WTS server 144 via the dedicated socket connec- 
tion established for web page 1200, WTS server 144 is able 
to recognize the origin of the command, web page 1200, and 
the name field upon which the change was made. 

At step 1310, WTS server 144 identifies the session 
created for browser 114A.

At step 1312, WTS server 144 locates the IP addresses 
assigned to participant browsers in participant list 1 and 
sends a command (together with the change of name field 
1202) to the Master Applets in all participant terminals 
(except that WTS server 144 does not send the command
and change to browser 114A, since this change originated
from browser 114A).

At step 1314, upon receiving the command, the
(consumer) Master Applets (including Master Applet
124N) pass the change of name field 1202 to their respective
DTS Applets, including DTS Applet 1 at browser 114N.

At step 1316, each DTS Applet 1 displays the update "Sue
Grant" in the name field on the respective web page 1200
displayed on the respective terminals, including terminal
104N.
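
The data synchronization flow of FIG. 13 can be sketched end to end as follows. This is a simplified illustration under stated assumptions rather than the actual Applet or server code: sockets are replaced by direct method calls, and the names `Participant`, `on_blur`, `apply`, and `relay` are hypothetical. The sketch keeps three behaviors from the text: a change is reported on loss of focus or Enter only if the field value actually changed, the server forwards it to every participant except the originator, and each receiver displays the update in the matching field.

```python
class Participant:
    """Hypothetical stand-in for one browser's page state (field name -> value)."""

    def __init__(self, participant_id, fields):
        self.participant_id = participant_id
        self.fields = dict(fields)
        # Last value reported or received per field, used to detect changes.
        self._last_reported = dict(fields)

    def on_blur(self, field_name):
        """Loss of focus or Enter key: return a change record, or None (step 1306)."""
        value = self.fields[field_name]
        if value == self._last_reported.get(field_name):
            return None  # nothing changed, so nothing is sent
        self._last_reported[field_name] = value
        return {"origin": self.participant_id, "field": field_name, "value": value}

    def apply(self, change):
        """Display a relayed change in the matching field (steps 1314-1316)."""
        self.fields[change["field"]] = change["value"]
        # Remember the value so it is not reported back as a new change.
        self._last_reported[change["field"]] = change["value"]


def relay(change, participants):
    """Forward a change to every participant except its origin (step 1312)."""
    delivered = []
    for participant in participants:
        if participant.participant_id == change["origin"]:
            continue
        participant.apply(change)
        delivered.append(participant.participant_id)
    return delivered
```

For example, with a consumer `Participant("114A", {"name": "Susan King"})` and an agent `Participant("114N", {"name": "Susan King"})`, setting the consumer's `name` field to "Sue Grant" and passing the result of `on_blur("name")` to `relay` updates only the agent's copy.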

It should be noted that the operation shown in FIG. 13 can 
be used to perform data synchronization for the other four 
data fields on web page 1200 shown in FIG. 12A. 

It should also be noted that the data field synchronization
can also be performed at terminal 104N. For example, as
shown in FIG. 12B, when the agent at terminal 104N enters
comments of "Account's name had been changed" to comments
field 1210' on web page 1200', this update will be
displayed in comments field 1210 at terminal 104A, by using
the operation shown in FIG. 13.
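
Whether an update travels from one terminal to another depends on the synchronization direction selected through the leader and follower check boxes. The sketch below shows one possible way to evaluate that rule. The encoding of the Direction field as two booleans (`leads`, `follows`), the `ParticipantRecord` type, and the default of both flags being set for the consumer's record are assumptions made for illustration; the text does not specify how the Direction field is stored.

```python
from dataclasses import dataclass


@dataclass
class ParticipantRecord:
    """Hypothetical participant-list row; Direction is modeled as two flags."""
    participant_id: str
    leads: bool = True    # this browser's activities are sent to the others
    follows: bool = True  # this browser displays the others' activities


def should_relay(source, target):
    """Return True if an update from source should be shown at target."""
    if source.participant_id == target.participant_id:
        return False  # never echo an update back to its originator
    return source.leads and target.follows
```

Under these assumptions, an agent record with only the leader flag set sends its activities to the consumer but does not receive the consumer's, an agent record with only the follower flag set does the reverse, and setting both flags gives bi-directional synchronization.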

Referring to FIG. 14, there is shown a flowchart illustrat- 
ing the operation of web page tracking, in accordance with 
the present invention. 

In the example shown in FIG. 14, it is assumed that: (1) 
a customer at terminal 104A is browsing web pages via 
browser 114A, (2) a session has been created for terminal
104A and all participant terminals, (3) session list 1, par- 
ticipant list 1, and URL history list 1 shown in FIG. 6 have 
been created for the session, (4) bi-directional synchroniza- 
tion has been selected for terminal 104A and all participant 
terminals, and (5) the (consumer) Master Applet, DTS 
Applets, and SessionID Applet have been downloaded into 
terminals 104A and all participant terminals. 

As shown in FIG. 14, at step 1404, browser 114A downloads
a web page from either the consumer page repository
146 or memory area 115A of terminal 104A. If Master
Applet 124A, DTS Applets 126A, and SessionID Applet
128A had not been downloaded to terminal 104A, browser
114A would download them from HTTP server 152.
However, in this example, these Applets have been downloaded.

At step 1406, web browser 114A initializes and invokes
Master Applet 124A, DTS Applets 126A, and SessionID
Applet 128A.

At step 1408, Master Applet 124A opens a dedicated
socket and establishes a socket connection to WTS gateway
142 for web browser 114A and the web page loaded. Master
Applet 124A then sends WTS server 144 a command,
together with: (1) an ID unique to browser 114A, and (2) the
URL of the web page loaded. When commands and URL are
delivered through this socket connection, WTS server 144 is
able to recognize the origin of the commands and URL.
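
One way a server could recognize the origin of later commands is to remember, per connection, the browser ID and URL announced when the connection was opened. The following sketch illustrates that idea only. The `ConnectionRegistry` class, its method names, and the use of an opaque `connection_id` in place of a real socket object are hypothetical and are not taken from the text.

```python
class ConnectionRegistry:
    """Hypothetical map from an open connection to the browser and page behind it."""

    def __init__(self):
        self._origins = {}  # connection ID -> (browser ID, page URL)

    def register(self, connection_id, browser_id, url):
        """Record the browser ID and URL sent when the connection is opened."""
        self._origins[connection_id] = (browser_id, url)

    def origin_of(self, connection_id):
        """Return (browser ID, URL) for a connection, or None if unknown."""
        return self._origins.get(connection_id)

    def close(self, connection_id):
        """Forget a connection once its page has been unloaded."""
        self._origins.pop(connection_id, None)
```

Because each page gets its own dedicated connection in the description, a lookup keyed by connection is enough to recover both the browser and the page without repeating them in every command.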

At step 1410, WTS server 144 identifies the session ID for
browser 114A.

At step 1412, WTS server 144 locates the session list 1 
and URL history list 1. 

At step 1414, WTS server 144 issues a time stamp
(loading time) for indicating the time at which the command
was received, and stores the URL and time stamp to URL
history list 1.

At step 1416, browser 114A sends WTS server 144 a 
request to load a subsequent web page. 

At step 1418, before loading the subsequent web page, via
the socket connection, Master Applet 124A sends WTS
server 144 a command, together with the URL, to inform
WTS server 144 that the current web page has been
unloaded.

At step 1420, WTS server 144 identifies the session for
terminal 104A.

At step 1422, WTS server 144 locates the session list 1 
and URL history list 1. 

At step 1424, WTS server 144 issues a time stamp
(unloading time) for indicating the time at which the command
was received, and stores the URL and time stamp to
URL history list 1.

At step 1426, Master Applet 124A disconnects the socket 
connection for the web page that has been unloaded. 
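
The web page tracking steps of FIG. 14 amount to writing one URL history row per page visit and filling in its unloading time later. The sketch below is a minimal illustration: the row keys reuse the field names listed for the URL history list (SessionID, PageURL, ParticipantID, LoadingTime, UnloadingTime), while the `UrlHistory` class, its method names, and the `now` argument standing in for the time stamp issued by the server when a command is received are assumptions.

```python
class UrlHistory:
    """Hypothetical stand-in for the URL history list of one session."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.rows = []

    def page_loaded(self, participant_id, url, now):
        """Store the URL with its loading time (steps 1410-1414)."""
        row = {
            "SessionID": self.session_id,
            "PageURL": url,
            "ParticipantID": participant_id,
            "LoadingTime": now,
            "UnloadingTime": None,  # filled in when the page is unloaded
        }
        self.rows.append(row)
        return row

    def page_unloaded(self, participant_id, url, now):
        """Stamp the most recent open visit to the URL (steps 1420-1424)."""
        for row in reversed(self.rows):
            if (row["ParticipantID"] == participant_id
                    and row["PageURL"] == url
                    and row["UnloadingTime"] is None):
                row["UnloadingTime"] = now
                return row
        return None  # no matching open visit was found
```

The difference between the two time stamps gives the time a participant spent on a page, which is the kind of figure a later analysis of the stored lists could use.
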
Referring to FIG. 15, there is shown a flowchart illustrating
the operation of data tracking, in accordance with the
present invention.

In the example shown in FIG. 15, it is assumed that: (1)
a customer at terminal 104A is browsing web pages via
browser 114A, (2) a session has been created for terminal
104A, (3) session list 1 and participant list 1 shown in FIG.
6 have been created for the session, (4) terminal 104N is one
of the participants, (5) web page 1200 containing five data
fields shown in FIG. 12A is displayed on terminal 104A and
all participant terminals, (6) a bi-directional synchronization
has been selected for terminal 104A and all participant
terminals, (7) the (consumer) Master Applet, DTS Applets,
and SessionID Applet are downloaded to terminal 104A and
all participant terminals, (8) the DTS Applets contain five
individual Applets: DTS Applet 1, DTS Applet 2, DTS
Applet 3, DTS Applet 4, and DTS Applet 5, (9) these five
individual DTS Applets are respectively responsible for
displaying, monitoring and processing the events that occur
on the five data fields of web page 1200 shown in FIG. 12A,
(10) Master Applet 124A has established a dedicated socket
connection to WTS server 144 for web page 1200 displayed
on terminal 104A, and (11) the customer at terminal 104A
wants to change name field 1202 from Susan King
to Sue Grant.

As shown in FIG. 15, at step 1504, the customer changes
the name in name field 1202 from Susan King to Sue Grant.

At step 1506, in response to a loss of focus on name field
1202 or pressing the Enter key, DTS Applet 1 detects the
change and passes the change to Master Applet 124A.

At step 1508, via the dedicated socket connection, Master
Applet 124A sends WTS server 144 a command together
with the change of name field 1202. Since this change is
passed to WTS server 144 via the dedicated socket connection
established for web page 1200, WTS server 144 is able
to recognize the origin of the command, web page 1200, and
the name field upon which the change was made.

At step 1510, WTS server 144 identifies the session
created for terminal 104A.

At step 1512, WTS server 144 stores the URL and update
of name field 1202 into data list 1.

It should be noted that the operation shown in FIG. 15 can
be used to perform data tracking for the other four data fields
on web page 1200, and to perform data tracking for all
participant terminals.

Referring to FIG. 16, there is shown a flowchart illustrating
the operation of joining a session by a supervisor, in
accordance with the present invention. In the example
shown in FIG. 16, it is assumed that the supervisor is on duty
at terminal 104K in a call center.

As shown in FIG. 16, at step 1604, the supervisor performs
the steps shown in FIG. 7, where the supervisor
downloads a supervisor page in which an (administration)
Master Applet (referred to as Master-Applet 1) and a Supervisor
Applet are embedded. The Supervisor Applet displays a
supervisor session interface (as shown in FIG. 8B) in a first
browser instance (see 800B in FIG. 17) on terminal 104K.
Master-Applet 1 maintains a dedicated socket connection to
WTS gateway 142 for the first browser instance (see 800B
shown in FIG. 17).

At step 1606, from the first browser instance (see 800B
shown in FIG. 17), the supervisor selects a session ID (listed
in text box 832) and then selects select session button 838.

At step 1608, in response to the selection of the select
session button 838, browser 114K downloads a supervisor
agent page, in which an (administration) Master Applet
(referred to as Master-Applet 2) and an Agent Applet are
embedded. The Agent Applet creates a supervisor agent
session interface 800C (as shown in FIG. 8C) and displays
it in a second browser instance (see 800C in FIG. 17) on
terminal 104K. Master-Applet 2 opens and maintains a dedicated
socket connection to WTS gateway 142 for the second
browser instance (see supervisor agent session interface
800C shown in FIG. 17).

At step 1610, there can be two possible scenarios. A first
scenario is that an agent has joined the selected session to
help the consumer, and the supervisor wants to join the
selected session as a participant. Under this scenario, the
supervisor simply selects join button 846, and leader and
follower buttons 850 and 852 are both selected automatically.
A second scenario is that no agent has joined the
session and the supervisor wants to join the session. Under
this scenario, the supervisor joins the session just like an
agent, by first selecting leader button 850 and/or follower
button 852, and then join session button 846. In this
example, it is assumed that a consumer at terminal 104A and
an agent at terminal 104N have joined the session, and the
supervisor wants to join the selected session.

At step 1612, the supervisor selects join session button
846.

At step 1614, via the socket connection for the second
browser instance (see supervisor agent session interface
800C shown in FIG. 17), Master-Applet 2 sends WTS
server 144 a command together with the selected session ID
for the selected session.

At step 1615, WTS server 144 locates the session indicated
by the selected session ID, and sends information
stored in participant list 1 and URL history list 1 (see FIG.
6) to Master-Applet 2.

At this step, participant list 1 includes three participant
records (three rows) for browsers 114A, 114K, and 114N
respectively.

At step 1618, Agent Applet at terminal 104K displays the
information stored in participant list 1 and URL history list
1 in participant text box 856 and URL history text box 858
(FIG. 8C) respectively.

At step 1620, WTS server 144 sends Master-Applet 2 the
URL of the web page being currently displayed at terminals
104A and 104N.

At step 1622, the Agent Applet at terminal 104K opens a
third browser instance (see 1704 in FIG. 17).

At step 1624, browser 114K downloads the web page
identified by the URL from consumer page repository 146
(or loads the web page from memory area 115K in terminal
104K if it is cached there), and displays it in the third
browser instance (see 1704 in FIG. 17).

At step 1626, browser 114K downloads (consumer) Master
Applet 124K, DTS Applets 126K, and SessionID Applet
128K from consumer page repository 146 according to the
applet tags in the current web page (assuming that these
Applets have not been previously downloaded).

At step 1628, Master Applet 124K opens a dedicated
socket and establishes a socket connection to WTS gateway
142 for the third browser instance 1704 shown in FIG. 17.

After step 1630, the web pages displayed in third browser
instance 1704 at terminal 104K are being synchronized with
the web pages being displayed at terminals 104A and 104N.

Referring to FIG. 17, there are shown three browser
instances (800B, 800C, and 1704) for the supervisor in
response to the steps shown in FIG. 16, in accordance with
the present invention.

Referring to FIG. 18, there is shown a flowchart illustrating
the operation of re-browsing a web page previously
reviewed in a session, in accordance with the present invention.

In the example shown in FIG. 18, it is assumed that: (1)
a consumer at terminal 104A is browsing web pages from
consumer page repository 146 via browser 114A, (2) a
session list 1 shown in FIG. 6 has been created for browser
114A, (3) an agent (or a supervisor) is on duty at terminal
104N in a call center, and agent class (or supervisor class)
has been assigned to browser 114N, (4) at browser 114N, the
first and second browser instances for the agent as shown in
FIG. 10 (or the first, second and third browser instances for
the supervisor as shown in FIG. 17) have been displayed, (5)
via their respective socket connections established by their
Master Applets, the first and second browser instances for
the agent as shown in FIG. 10 (or the first, second and third
browser instances for the supervisor as shown in FIG. 17)
have been connected to WTS gateway 142, (6) Master
Applets (124A and 124N), DTS Applets (126A and 126N)
and SessionID Applets (128A and 128N) have been downloaded
into terminals 104A and 104N respectively, (7) the
agent (or supervisor) has selected and joined the session
created for browser 114A, (8) at browser 114N, the second
browser instance for the agent as shown in FIG. 10 (or the
third browser instance for the supervisor as shown in FIG.
17) is being synchronized with browser 114A, and (9)
bi-directional synchronization has been selected for browsers
114A and 114N.

At step 1802, for an agent user, via scrollable list box 818
on the agent session interface shown in FIG. 10, he/she reviews
the URLs for all the web pages previously browsed by
browser 114A in the selected session. For a supervisor, via
scrollable list box 858 on supervisor session interface shown

At step 1616, WTS server 144 stores ParticipantID and in FIG, 17, he/she reviews the URLs for all the web pages 

ParticipantAddress into participant list 1 for browser 114K. previously browsed by browser 114Ain the selected session. 
At step 1804, to display an individual web page previously
browsed by browser 114A, the agent (or supervisor)
selects a URL from scrollable list box 818 (or scrollable list
box 858) and double-clicks on it.

At step 1806, the (agent) Master Applet (or the
supervisor Master Applet) sends WTS server 144 a
command together with the selected URL, via its respective
socket connection.

At step 1808, WTS server 144 sends a command together with
the URL and the time information (loading and unloading) to
Master Applets 124A and 124N, so that Master Applets 124A
and 124N can inform their respective browsers 114A
and 114N to load and display the web page based on the
URL.

At step 1810, WTS server 144 checks whether browser 114A
previously performed any activities to data fields on the web
page identified by the URL, based on the information stored
in URL history list 1 and data list 1. As shown in FIG. 6,
URL history list 1 contains the information about: (a) participant
ID of browser 114A, (b) the URL, and (c) the loading and
unloading time of the web page identified by the URL. Data
list 1 contains the information about: (a) data field names for
data fields, (b) values of the data fields, and (c) the times at
which values of the data fields were changed.

At step 1812, if browser 114A previously performed any
activities to the data fields on the web page identified by the
URL, WTS server 144 sends a command (together with the
data field names, values of the data fields, and time
information) to Master Applet 124A (at browser 114A) and
Master Applet 124N (at browser 114N).

At step 1814, at browser 114A, Master Applet 124A
passes the command, data field names, and data field values
to DTS Applets 126A, so that DTS Applets 126A can display
the data field values in the respective data fields on the web
page being displayed. At browser 114N, Master Applet
124N passes the command, data field names, and data field
values to DTS Applets 126N, so that DTS Applets 126N can
display the data field values in the respective data fields on the
web page being displayed.

Since the loading time and unloading time of the URL and
the setting time for a data field are recorded in URL history
list 1 and data list 1, if desired, the web page identified by
the URL and the activities performed to the data fields can
be duplicated (loading the web page, setting data fields on
the web page, and unloading the web page) according to the
time information.

Referring to FIG. 19, there is shown a flowchart illustrating
the operation of re-browsing all web pages previously
reviewed in a session, in accordance with the present invention.

In the example shown in FIG. 19, it is assumed that: (1)
a consumer at terminal 104A is browsing web pages from
consumer page repository 146 via browser 114A, (2) a
session list 1 shown in FIG. 6 has been created for browser
114A, (3) an agent (or a supervisor) is on duty at terminal
104N in a call center, and agent class (or supervisor class)
has been assigned to browser 114N, (4) at browser 114N, the
first and second browser instances for the agent as shown in
FIG. 10 (or the first, second and third browser instances for
the supervisor as shown in FIG. 17) have been displayed, (5)
via their respective socket connections established by their
respective Master Applets, the first and second browser
instances for the agent as shown in FIG. 10 (or the first,
second and third browser instances for the supervisor as
shown in FIG. 17) have been connected to WTS gateway
142, (6) Master Applets (124A and 124N), DTS Applets
(126A and 126N) and SessionID Applets (128A and 128N)
have been downloaded into terminals 104A and 104N
respectively, (7) the agent (or supervisor) has selected and
joined the session created for browser 114A, (8) at browser
114N, the second browser instance for the agent as shown in
FIG. 10 (or the third browser instance for the supervisor as
shown in FIG. 17) is being synchronized with browser
114A, and (9) bi-direction synchronization has been selected
for browsers 114A and 114N.

At step 1902, an agent user, via scrollable list box 818
on the agent session interface shown in FIG. 10, reviews
the URLs for all the web pages previously browsed by
browser 114A in the selected session. A supervisor, via
scrollable list box 858 on the supervisor session interface shown
in FIG. 17, reviews the URLs for all the web pages
previously browsed by browser 114A in the selected session.

At step 1904, to display all web pages previously browsed
by browser 114A, the agent (or supervisor) selects Go to
URLs button 820 in the agent session interface as shown in
FIG. 10 (or Go to URLs button 860 in the supervisor session
interface as shown in FIG. 17).

At step 1906, the (agent) Master Applet, or the
(supervisor) Master Applet, sends a command to WTS
server 144.

At step 1908, WTS server 144 sequentially sends commands,
together with the URLs and time information, to Master Applets
124A and 124N, so that Master Applets 124A and 124N can
inform their respective browsers 114A and 114N to load and
display the web pages based on the URLs.

At step 1910, for each one of the URLs that are sent together with
the commands, WTS server 144 checks whether browser
114A previously performed any activities to data fields on a
web page identified by a respective URL, based on the
information stored in URL history list 1 and data list 1.

At step 1912, if browser 114A previously performed any
activities to the data fields on the web page identified by a
respective URL, WTS server 144 sends a command
(together with the data field names and values of the data
fields) to Master Applet 124A (at browser 114A) and
Master Applet 124N (at browser 114N).

At step 1914, at browser 114A, Master Applet 124A
passes the command, data field names, and data field values
to DTS Applets 126A, so that DTS Applets 126A can display
the data field values in the respective data fields on the web
page being currently displayed. At browser 114N, Master
Applet 124N passes the command, data field names, and
data field values to DTS Applets 126N, so that DTS Applets
126N can display the data field values in the respective data
fields on the web page being currently displayed.

Since the loading time and unloading time of the URLs
and the setting time for data fields are recorded in URL
history list 1 and data list 1, if desired, all the web pages
identified by the URLs and the activities performed to the
data fields can be duplicated (loading the web page, setting
data fields on the web page, and unloading the web page)
according to the timing information.

It should be noted that, in the above-described
embodiments, all the Applets (Master Applets, DTS Applets,
SessionID Applets, and Agent Applet) embedded into web
pages are written using Java. However, some or all of these
Applets can be written using a browser script language, such
as JavaScript. More specifically, the code for these Applets
can be selectively written into web pages using the browser
script language, instead of using applet tags to link these
Applets. When a web browser downloads a web page
containing the Applets written in browser script language, it
stores these Applets into the memory area of the terminal on
which the web browser is running, and then initializes and
invokes them.
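The timed re-browsing described for FIGS. 18 and 19 can be illustrated with a minimal Python sketch. It is not the patented implementation: the record types `UrlVisit` and `FieldChange`, the function `build_replay_schedule`, the action names, and the `speed` parameter are all hypothetical names introduced here. The sketch only shows how loading times, unloading times, and data field setting times, as recorded in a URL history list and a data list, could be merged into one ordered schedule of replay actions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class UrlVisit:
    """One URL history entry (hypothetical layout): a page and its load/unload times."""
    url: str
    load_time: float      # seconds, on any common clock
    unload_time: float


@dataclass(frozen=True)
class FieldChange:
    """One data list entry (hypothetical layout): a field value set at a given time."""
    url: str
    field_name: str
    value: str
    time_stamp: float


# Tie-break for events with equal time stamps: unload the old page first,
# then load the new page, then set fields. This ordering is an assumption
# made for the sketch, not something the text specifies.
_RANK = {"UNLOAD": 0, "LOAD": 1, "SET_FIELD": 2}


def build_replay_schedule(
    visits: List[UrlVisit],
    changes: List[FieldChange],
    speed: float = 1.0,
) -> List[Tuple[float, str, str, str]]:
    """Return (delay_from_start, action, url, detail) tuples in replay order.

    `speed` greater than 1 compresses the recorded delays; less than 1 stretches them.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")

    events = []
    for v in visits:
        events.append((v.load_time, "LOAD", v.url, ""))
        events.append((v.unload_time, "UNLOAD", v.url, ""))
    for c in changes:
        events.append((c.time_stamp, "SET_FIELD", c.url, f"{c.field_name}={c.value}"))

    if not events:
        return []

    events.sort(key=lambda e: (e[0], _RANK[e[1]]))
    start = events[0][0]
    return [((t - start) / speed, action, url, detail)
            for t, action, url, detail in events]


if __name__ == "__main__":
    visits = [UrlVisit("/bill", 100.0, 130.0), UrlVisit("/pay", 130.0, 150.0)]
    changes = [FieldChange("/bill", "Payment", "100.00", 110.0)]
    for step in build_replay_schedule(visits, changes, speed=2.0):
        print(step)
```

A replaying component would wait for each delay and then issue the corresponding load, set-field, or unload command to the participating browsers; how those commands are delivered is outside this sketch.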
The present invention has the following advantages:

Dependable web page tracking and synchronizing: It
tracks and synchronizes all user activities, even if web pages
come from cached pages stored in browser cache or proxy
servers.

Ease of use: It eliminates the current manual process of
multiple users separately re-creating the web navigation.

Ease of execution (from the users' point of view): It does not
require additional software to support the present invention.
No software needs to be installed, configured, or run by a
user.

Portability: It works across different operating systems
at both client and server sides. On the client side, the
requirement is that there be a web browser that supports Java
Applets. On the server side, the requirement is that there be
a Java Virtual Machine (JVM) on the same server that
provides the HTTP service. Since there are JVMs for practically
every operating system, the server components of the
present invention have the potential to run on all the operating
systems.

Compatibility: It works together with HTTP servers
from different vendors because the server components of the
present invention require no processing by HTTP servers,
and thus are independent of HTTP servers.

Flexibility: Web page synchronization can be used independently
of, or in conjunction with, web page tracking. Web page
synchronization does not require persistent storage of any of
the data tracked.

User privacy: It ensures a reasonable level of user
privacy, since tracking and synchronization are limited to
pages served by a web site that the information provider has
control over.

Multiple HTTP servers supported: It can handle the situation
where a company has multiple physical servers running
its web site, since the separation of the WTS gateway
and server components enables a gateway to be placed on
each HTTP server, each communicating with a common
WTS server.

While the invention has been illustrated and described in 
detail in the drawing and foregoing description, it should be 
understood that the invention may be implemented through 
alternative embodiments within the spirit of the present 
invention. Thus, the scope of the invention is not intended to 
be limited to the illustration and description in this 
specification, but is to be defined by the appended claims. 

What is claimed is: 

1. A method for tracking interactions with pages that have
been loaded from a web server to a terminal during a user
session, and for storing information about the interactions to
a page tracking server, comprising the steps of:

loading a first page from the web server, the first page 
being associated with a page locator for indicating a 
location of the first page in the web server, and the first 
page containing location information for indicating a 
location of a program; 
loading the program from the web server based on the 

location information, and executing the program; 
the program monitoring interactions with the page; and 
the program sending information about the interactions to 
the page tracking server during the session, 
creating a session table using sent information about
the interactions,
creating a sessionID for the session table wherein each
sessionID is associated with a session list for maintaining
information about a session, a participant list
for maintaining information about all participant
browsers in a session, a URL history list for maintaining
information about all web pages visited by all
participants in a session, a data list for maintaining
information about the data fields on the web pages
visited by all participants in a session, and a command
list for maintaining information about all commands
issued to the server by the various participants
in a session,

wherein the data list includes data fields for a Session
ID for linking the data list to a session, a WasRelayed
for indicating if this data field has been broadcasted,
a FieldName for indicating the actual name of the
data field, a DataName for indicating the name of the
data field displayed on a web page, a DataValue for
indicating the value of the data field, a TimeStamp
for indicating the time at which this data field is
updated, a URL for indicating the web page on which
the data field was displayed, and a ParticipantID for
indicating the participant browser who updated this
data field.

2. The method of claim 1, further comprising the steps of:
loading a second page from the web server, the second
page being associated with a page locator for indicating
a location of the second page in the web server, the
second page containing location information for indicating
a location of the program loaded in said loading
step; and

executing the program based on the location information 

in the second page; 
the program monitoring interactions with another page; 
and 

the program sending the information about the interac- 
tions with the second page to the page tracking server 
during the session. 

3. The method of claim 2, further comprising the step of: 
storing the information about the interactions with the 

second page in the page tracking server during the 
session. 

4. The method of claim 2, comprising storing the sent 
information from the web servers. 

5. The method of claim 2, comprising collecting and 
analyzing the stored information about the interactions. 

6. The method of claim 1, further comprising the step of: 
storing the information about the interactions with the first 

page in the page tracking server. 

7. The method of claim 1, the page locator being URL, 
and the location information being a tag. 

8. The method of claim 1, the web server being an HTTP 
server. 

9. The method of claim 1, comprising opening a dedicated 
socket between the terminal and a gateway. 

10. The method of claim 1, comprising informing the page 
tracking server that the first page has been loaded on the 
terminal. 

11. The method of claim 1, wherein a session is defined
as a collection of web page interactions that occur over a given
period of time from a specific browser.

12. The method of claim 1, further comprising creating a 
session table using sent information about the interactions. 

13. The method of claim 12, comprising creating a
sessionID for the session table wherein each sessionID is
associated with a session list for maintaining information
about a session, a participant list for maintaining information
about all participant browsers in a session, a URL
history list for maintaining information about all web pages
visited by all participants in a session, a data list for
maintaining information about the data fields on the web
pages visited by all participants in a session, and a command
list for maintaining information about all commands issued
to the server by the various participants in a session.

14. The method of claim 13, wherein the session list
includes a sessionID for identifying a session, a user name
for indicating the actual name for whom the session is
created, a start time for indicating the time of starting the
session, a stop time for indicating the time of stopping the
session, and session notes for recording the notes of the
session.

15. The method of claim 14, wherein the participant list
includes a sessionID for linking the participant list to a
session, a participantID for identifying a participant,
ParticipantAddresses for indicating a participant's IP address, a
class for indicating a user class of the participant, and a
direction for indicating the synchronization direction for the
participant browser.

16. The method of claim 13, wherein the URL history list
includes a SessionID for linking the URL history list to a
session, a Page URL for indicating the URL of a web page
visited, a ParticipantID for identifying a participant who
visited the web page, a loading time for indicating the
loading time of the web page, and an unloading time for
indicating the unloading time of the web page.

17. The method of claim 13, wherein the command list
includes data fields for a Session ID for linking the data list
to a session, a Command for indicating the specific command
executed, a URL for indicating the web page to which the
command operated, a FieldPoint for indicating the data field
to which the command operated, and a TimeStamp for
indicating the time at which the command was executed.
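The lists recited in claims 13 to 17, together with the data list recited in claim 1, can be pictured as one record per session. The following Python sketch is illustrative only: the field names follow the claims, but the class names, the in-memory `SessionTable`, its methods, and the example string values for user class and direction are assumptions made here, not details taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Participant:
    participant_id: str
    participant_address: str   # the participant's IP address
    user_class: str            # assumed values, e.g. "consumer", "agent", "supervisor"
    direction: str             # assumed values, e.g. "leader", "follower", "both"


@dataclass
class UrlHistoryEntry:
    page_url: str
    participant_id: str
    loading_time: float
    unloading_time: Optional[float] = None   # unknown until the page is unloaded


@dataclass
class DataEntry:
    field_name: str            # actual name of the data field
    data_name: str             # name displayed on the web page
    data_value: str
    time_stamp: float
    url: str
    participant_id: str
    was_relayed: bool = False


@dataclass
class CommandEntry:
    command: str
    url: str
    field_point: str
    time_stamp: float


@dataclass
class Session:
    session_id: str
    user_name: str
    start_time: float
    stop_time: Optional[float] = None
    session_notes: str = ""
    participants: List[Participant] = field(default_factory=list)
    url_history: List[UrlHistoryEntry] = field(default_factory=list)
    data_list: List[DataEntry] = field(default_factory=list)
    command_list: List[CommandEntry] = field(default_factory=list)


class SessionTable:
    """Hypothetical in-memory session table keyed by session ID."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Session] = {}

    def create(self, session_id: str, user_name: str, start_time: float) -> Session:
        if session_id in self._sessions:
            raise ValueError(f"session {session_id} already exists")
        session = Session(session_id, user_name, start_time)
        self._sessions[session_id] = session
        return session

    def get(self, session_id: str) -> Session:
        return self._sessions[session_id]

    def record_load(self, session_id: str, participant_id: str,
                    url: str, when: float) -> None:
        self.get(session_id).url_history.append(
            UrlHistoryEntry(url, participant_id, when))

    def record_unload(self, session_id: str, participant_id: str,
                      url: str, when: float) -> bool:
        """Close the most recent open visit of `url` by this participant."""
        for entry in reversed(self.get(session_id).url_history):
            if (entry.page_url == url and entry.participant_id == participant_id
                    and entry.unloading_time is None):
                entry.unloading_time = when
                return True
        return False
```

Because every list hangs off the session record, the session ID that each claimed list carries for linking becomes implicit in this layout; a relational store would instead keep it as an explicit column in each table.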

18. The method of claim 1, further comprising creating a 
session interface for an administrator to monitor the infor- 
mation about the interactions. 

19. The method of claim 18, comprising joining the
administrator to a session to monitor the information about
the interactions.

20. The method of claim 1, further comprising creating a 
browser supervisor session interface to monitor the infor- 
mation about the interactions. 

21. A method for tracking interactions with pages that
have been loaded from a web server to a terminal during a
user session, and for storing information about the interactions
to a page tracking server, comprising the steps of:

loading a first page from the web server, the first page
being associated with a page locator for indicating a
location of the first page in the web server, and the first
page containing location information for indicating a
location of a program;
loading the program from the web server based on the
location information, and executing the program;
the program monitoring interactions with the page; and
the program sending information about the interactions to
the page tracking server during the session,
joining the administrator to a session to monitor the
information about the interactions,
wherein the session interface includes a text box for
entering a SessionID, a join session button for joining
a session identified by the Session ID, a drop
button for leaving a session, a leader check box, a
follower check box, a scrollable list box for displaying
the information contained in the participant list
associated with a selected session, a scrollable list
box for displaying the information in an identified
URL history list, and a text box for displaying the
information in an identified data list.

22. The method of claim 21, further comprising the steps 
of: 
loading a second page from the web server, the second
page being associated with a page locator for indicating
a location of the second page in the web server, the
second page containing location information for indicating
a location of the program loaded in said loading
step; and

executing the program based on the location information
in the second page;
the program monitoring interactions with another page;
and

the program sending the information about the interac- 
tions with the second page to the page tracking server 
during the session. 

23. The method of claim 21, further comprising the step 
of: 

storing the information about the interactions with the first 
page in the page tracking server. 

24. The method of claim 21, the page locator being URL, 
and the location information being a tag. 

25. The method of claim 21, the web server being an 
HTTP server. 

26. The method of claim 21, comprising opening a 
dedicated socket between the terminal and a gateway. 

27. The method of claim 21, comprising informing the 
page tracking server that the first page has been loaded on 
the terminal. 

28. The method of claim 21, wherein a session is defined 
as a collection of web page interactions that occur over a 
given period of time from a specific browser. 

29. A method for tracking interactions with pages that 
have been loaded from a web server to a terminal during a 
user session, and for storing information about the interac- 
tions to a page tracking server, comprising the steps of: 

loading a first page from the web server, the first page 
being associated with a page locator for indicating a 
location of the first page in the web server, and the first 
page containing location information for indicating a 
location of a program; 
loading the program from the web server based on the 

location information, and executing the program; 
the program monitoring interactions with the page; and 
the program sending information about the interactions to 
the page tracking server during the session, 
creating a browser supervisor session interface to moni- 
tor the information about the interactions, 
wherein the supervisor session interface includes a 
scrollable list box for displaying session IDs of all 
active sessions in the session table and for selecting 
one of the session IDs, a text box for displaying 
relevant statistics of the server, a multi-column scrol- 
lable list box for displaying details about the session 
selected in the scrollable list box, a select session 
button for selecting a session from the scrollable list 
box. 

30. The method of claim 29, comprising monitoring 
operational status of a session using the scrollable list box. 

31. The method of claim 29, further comprising the steps 
of: 

loading a second page from the web server, the second 
page being associated with a page locator for indicating 
a location of the second page in the web server, the 
second page containing location information for indi- 
cating a location of the program loaded in said loading 
step; and 

executing the program based on the location information 
in the second page; 
the program monitoring interactions with another page;
and

the program sending the information about the interac- 
tions with the second page to the page tracking server 
during the session. 

32. The method of claim 29, further comprising the step
of:

storing the information about the interactions with the first 
page in the page tracking server. 

33. The method of claim 29, the page locator being URL, 
and the location information being a tag. 
34. The method of claim 29, the web server being an
HTTP server.

35. The method of claim 29, comprising opening a 
dedicated socket between the terminal and a gateway. 

36. The method of claim 29, comprising informing the 
page tracking server that the first page has been loaded on 
the terminal. 

37. The method of claim 29, wherein a session is defined
as a collection of web page interactions that occur over a given
period of time from a specific browser.

* * * * * 
FIG. 11

FIG. 12A

[Figure: web page 1200, titled "TELEPHONE BILL", with the following data fields]
1202 Name: Susan King
1204 Time Period: 07/01/95-08/01/95
1206 Account Balance: $100.00
1208 Payment: (blank)
1210 Comments: (blank)
1212 SessionID: 1234567
1214 Call Center Number: 1-800-456-7777

U.S. Patent    Jul. 9, 2002    Sheet 15 of 20    US 6,418,471 B1

FIG. 12B
METHOD FOR RECORDING AND
REPRODUCING THE BROWSING
ACTIVITIES OF AN INDIVIDUAL WEB
BROWSER

RELATED APPLICATIONS

This is a continuation-in-part of application Ser. No.
08/944,124 filed Oct. 6, 1997, now U.S. Pat. No. 5,951,643.

The present invention relates generally to a method and
system for recording and reproducing the activities performed
by an individual web browser.

BACKGROUND OF THE INVENTION

Web browsers, such as Microsoft Corporation's Internet
Explorer internet browser and Netscape Communications
Corporation's Navigator™ internet browser, provide a
graphical, easy-to-navigate interface for retrieving and
viewing information available from the Web. The World
Wide Web utilizes a system known as a hypertext system to
facilitate navigation through the World Wide Web environment.
The hypertext system employs special connections, or
links, which are embedded into documents displayed
through use of internet browser software. Clicking on a
word, phrase, image or thumbnail graphic including one of
these links instructs the browser to retrieve a document,
graphic, sound, or other information associated with the
embedded link.

Documents published electronically in hypertext form,
i.e., documents constructed utilizing Hypertext Markup Language
(HTML), are becoming a de facto standard for sharing
information online. HTML is a language for describing
structured documents. Documents written in HTML are
ASCII text documents including: (1) the text of the
document, and (2) HTML tags that identify document
elements, formatting and structure, as well as links to
graphic files, other documents, or other information.
Browser software interprets the HTML tags included in
downloaded documents to properly display the text and
referenced elements. Images and other referenced elements
are not normally contained in the downloaded document,
however, and must be separately downloaded in order to
completely display the HTML document and referenced
elements.

Documents, also referred to as web pages, as well as the
image files, additional document files and other elements
referenced in the web pages may be maintained on one or
multiple web servers remote from the user retrieving and
viewing the web pages. With the development of information
technology and networking infrastructure, more and
more businesses, government agencies, universities and
individuals maintain web sites and conduct business over the
Internet. For example, a service provider can display its
service in a set of web pages maintained on a web server
owned by the provider.

To improve the service and assistance provided to consumers
using the Internet, it is desirable to provide a
mechanism to record the detailed browsing activities of an
individual browser, or to force a browser to reproduce the
browser activities recorded at one or more browsers. The
ability to record and reproduce browser activities is useful in
a variety of applications such as consumer behavior
analysis, quality assurance, and training applications.

One difficulty in achieving the above objectives is that
web servers are designed to serve each individual web
browser in a stateless manner where one browser request
has no relationship to previous or subsequent requests. In
processing of requests, a web site has no control over the
sequences of the requests.

Currently it is very difficult to accurately capture the
browsing activities of a person visiting a web site. Analyzing
page requests received by a web server does not accurately
reflect the actual sequence of pages visited by an individual
browser. This is because:

Web pages retrieved by a browser from browser cache or
a proxy server will not be recorded as a page request on
the web server.

A web server does not record the timing characteristics of
a user's interaction with a form because this interaction
is handled directly by the web browser. The final state
of the data entered into a form is sent by the browser to
the web server when the form is submitted.

Because of the limited quality (granularity) of data maintained
by the server, it is currently not possible for a
server-based application to reproduce the sequence of activities
performed by a specific browser. Web servers currently
record statistics on each page requested, but do not reliably
keep track of each page that is displayed at a user's browser,
or record page requests retrieved from the browser's cache
or a proxy server. Nor does the web server currently record
browser activities including the timing of each data entry
into an HTML form. There are no known tools that are
capable of recording detailed browser activities and then
later providing a mechanism to replay the same sequence of
browser activities through one or more browsers.

OBJECTS OF THE INVENTION

It is therefore an object of the present invention to provide
a new and useful mechanism to record the detailed browsing
activities of an individual browser, regardless of whether
the information displayed by the browser was obtained from
browser cache, proxy server cache or a web server.

It is another object of the present invention to provide
such a mechanism which includes the ability to store each
page request and each piece of data entered into an HTML form
along with a time stamp to keep track of when each of
the browser activities took place.

It is a further object of the present invention to provide a
new and useful mechanism to force a browser to reproduce
a recorded set of browser activities at one or more browsers.

It is yet another object of the present invention to provide
such a mechanism wherein the set of recorded browser
activities can be reproduced at the same rate as the original
browser activities, or can be reproduced at proportionately
slower or faster rates.

It is an additional object of the present invention to
provide such a mechanism wherein the playing of a
sequence of web pages may be started, stopped or paused at
any URL in the sequence of pages recorded.

SUMMARY OF THE INVENTION

In one aspect, the present invention provides a method for
repeating browsing activities performed by a customer or
user. The method comprises the steps of: (a) recording
browsing activities as they are performed at a first terminal;
(b) retrieving to a second terminal the browsing activities
recorded at the first terminal; and (c) repeating at the second
terminal the browsing activities recorded at the first terminal.

The present invention further provides a method for
recording the activities of a number of customers using a
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browser and then selecting one browser session for replay at FIG. 13 shows a flowchart illustrating the operation of 

another browser. The method comprises the steps of: (a) data synchronization, in accordance with the present inven- 

browsing web pages at a plurality of user browsers; (b) tion. 

creating a plurality of sessions at a network server, each of FIG. 14 shows a flowchart illustrating the operation of 

said sessions being associated with a respective one of said 5 weD pagc tracking, in accordance with the present invention. 

plurality of user browsers; (c) recording the browsing aciivi- FIG. 15 shows a flowchart illustrating the operation of 

ties for each of the user browsers into an associated session data ^ ^ accordance ^ tDe t mverjtion . 

at said network server; (d) select ing one of the sessions; and _ T _ ^ m . 

(e) repeating activities recorded for the selected session at an ^IG. 16 shows L a flowchart illustrating the operation of 

administrative browser. We could also replay to more than 10 ^-browsing a web page previously browsed in a session, in 

one browser at the same time. accordance with the present invention. 

FIG. 17 shows a flowchart illustrating the operation of 

BRIEF DESCRIPTION OF THE DRAWINGS re-browsing all web pages previously browsed in a session, 

_ . . c . ... in accordance with the present invention. 

The purpose and advantages of the present invention will r 

be apparent to those skilled in the art from the following 15 DETAILED DESCRIPTION OF THE 

detailed description in conjunction with the appended PREFERRED EMBODIMENT 

drawing, in which: 

FIG. 1 shows a system includes N terminals, a network, The following description is presented to enable any 

and a web site, in accordance with the present invention. P erson skilled in the art to make and use the invention, and 

FIG. 2 shows a situation where each of the N terminals * P rovided k *<; c « of a particular application and its 

has downloaded its respective Master Applets, DTS Applets, requirements. Various modifications to the preferred 

jo m a i * • j -.u *u „ * ■ embodiments will be readily apparent to those skilled in the 

and SessionID Applet, in accordance with the present mven- , , . . . , „ rr . . ... 

. art, and the principles denned herein may be applied to other 

embodiments and applications without departing from the 

FIG. 3 shows the process of the (consumer) Master 25 spirit and scope of the invention. Thus, the present invention 

Applet, DTS Applets, and SessionID Applet being down- fc QOl inlended lo ^ limited to lhe embodiments shown, but 

loaded into a terminal, in accordance with the present is to be accorded with the broadest scope consistent with the 

invention. principles and features disclosed herein. 

FIG. 4 shows the process of the (consumer) Master Referring to FIG. 1, there is shown an exemplary web 

Applet, DTS Applets, and SessionID Applet being invoked, 30 pagc traddng ^ synching system 10 fj, in accordance with 

in response to loading a subsequent web page, to perform the mc present invention. 

operations in accordance with the present invention, when A . „ Ti ^ ' , , . xr t . 

af a i . u u i a i a a a u a As shown in FIG. 1, the system includes N terminals 

irTterSd ^ downloaded ■ nd cach6d 104A, 104B 104N; a network 129 such as the Internet, 

rs or a combination of the Internet and an Intranet and a web 

FIG. 5 shows the process of the (consumer) Master site m Each of the terminals has a telephone set or an 

Applet, DTS Applets, and SessionID Applet being invoked, alternate meaQS of comrnunicat i ons SU ch as an ip phone, chat 

in response to loading a subsequent web page, to perform the appucatiorjj etc . y i 02 A, 102B, . . , , 102N installed in its 

operations in accordance with the present invention, when vicinity. Each of the terminals can be a PC computer, a 

both these Applets and the web page have been previously workstatiorij a Java slatioDj or even a web TV system, 

downloaded and cached in a terminal. ^ ^ ^ includes a WTS ^ Tracking and 

FIG. 6 shows a session table in greater detail, in accor- Synching) gateway 142, a WTS server 144 containing a 

dance with the present invention. session table 145 and a user cla&s table 147 a database 

FIG. 7 shows how an agent (or supervisor) can create a processing application 148, an HTTP (Hyper Text Transfer 

session interface by downloading an agent page (or a 45 Protocol) server 152, and a hard disk unit 154 for storing 

supervisor page) from administration page repository 149, in consumer page repository 146, administration page reposi- 

accordance with the present invention. tory 149, and database 156. All the components in web site 

FIG. 8A shows an agent session interface, in accordance 134 can be installed in one or distributed across one or more 

with the' present invention. computer systems. Each of the computer systems includes a 

FIG. 8B shows a browser supervisor session interface, in 50 processing unit (which may include a plurality of 

accordance with the current invention. processors), a memory device, and a disk unit (which may 

FIG. 8C shows a supervisor agent session interface, in include a of disk sets >' 

accordance with the present invention. Each of the terminals 104A, 104B, . . . ,104N, includes a 

FIG. 9 shows a flowchart illustrating the operation of ?^ ssor un * S <| 0WD > aD T d a ar u ea 115A > 

joining a session by an agent, in accordance with the present 55 ^B, ■ • • , 115N > aD A ™ s a Java enabIed web browser 

invention 114A » 114B ' * * * ,114N * Each of lhe memor y area U5A > 

tA . ... , • • . u 115B, . . . , 115N, is maintained by its respective browser 

FIG. 10 shows a screen display containing two browse m mN Mft netwQrk ^ ^ f ^ 

instances, in accordance with the present invention. ^ mB mN fa ab]e (o ^ iQ 

FIG. 11 shows a flowchart illustrating the operation of 60 and receive web pages from HTTP server 152, and to display 

web page synchronization, in accordance with the present me we5 pages received at its respective terminal. Each of the 

invention. browsers 114A, U4B, . . . ,114N, is able to run a Master 

FIG. 12Ashows a web page containing 6ve data fields, in Applet 124A, 124B, . . . ,124N, a set of DTS (Data Tracking 

accordance with the present invention. aac ] Synching) Applets, a SessionID Applet, and an Agent 

FIG. 12B shows a web page that is similar to that of FIG. 65 Applet. As shown in FIG. 1, these Applets are stored in 

12 A, except that the data in one of the five data fields is consumer page repository 146 and can be downloaded from 

changed, in accordance with the present invention. consumer page repository 146 and stored in the memory 
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areas (otherwise known as browser cache) of the terminals 
104A, 104B, . . . , 104N. 

Referring to FIG. 2, there is shown the situation where
each of the terminals 104A, 104B, . . . , 104N, has downloaded
its respective Master Applets 124A, 124B, . . . , 124N,
DTS Applets 126A, 126B, . . . , 126N, and SessionID
Applets 128A, 128B, . . . , 128N, in accordance with the
present invention.

In FIG. 2, each of the (consumer) Master Applets 124A,
124B, . . . , 124N, is primarily responsible for:
(1) establishing a socket connection to WTS gateway 142 via
network 129 for its respective browser 114A, 114B, . . . ,
114N, in response to loading each web page at its respective
browser, (2) communicating with WTS server 144 via the
socket connection, from which WTS server 144 is able to
identify the origin, i.e., which browser, which web page,
etc., of the commands and information that are being delivered
through, (3) monitoring the activities of its respective
browser, (4) sending the information about its respective
browser's activities to WTS server 144, (5) receiving and
processing the information about other browsers' activities,
(6) via the socket connection, providing a single communication
path to WTS server 144 for DTS Applets 126A, 126B,
. . . , 126N, SessionID Applets 128A, 128B, . . . , 128N, or
any other consumer Applets embedded on the same page
with the Master Applet, (7) sending commands to WTS
server 144 to request services, for itself and for DTS Applets
126A, 126B, . . . , 126N, SessionID Applets 128A, 128B,
. . . , 128N, or any other consumer Applets embedded on the
same page with the Master Applet, and (8) sending user class
information together with the commands, to indicate that its
respective browser is a consumer user.

Each set of DTS Applets 126A, 126B, . . . , 126N, contains 
one or more individual DTS Applets, which are primarily 
responsible for: (1) displaying and monitoring the data 
activities (data inputs or data updates of data fields) on web 
pages that are being displayed by its respective browser, (2) 
sending the data activities to WTS server 144 via its respec- 
tive Master Applet, (3) receiving the data activities from 
other browsers via its respective Master Applet, and (4) 
processing the data activities from other browsers for the 
web pages that are being displayed by its respective browser. 

Each of the SessionID Applets 128A, 128B, . . . , 128N,
is responsible for retrieving, and for displaying on a web
page, the current SessionID.

As shown in administration page repository 149, Agent 
Applet (or Supervisor Applet) is responsible for creating a 
session interface, joining, monitoring, and controlling a 
session through the session interface. The (administration) 
Master Applet is primarily responsible for: (1) opening a 
dedicated socket, and establishing a socket connection to 
WTS gateway 142 via network 129 for the session interface 
created by Agent Applet, Supervisor Applet, or any other 
administration Applets embedded on the same web page 
with the Master Applet, (2) communicating with WTS server 
144 via the socket connection, from which WTS server 144 
is able to identify the origin (i.e. from which session 
interface) of the commands and information that are being 
delivered through, (3) via the socket connection, providing 
a single communication path to WTS server 144 for Agent 
Applet, Supervisor Applet, or any other administration 
Applets embedded on the same web page with the Master 
Applet, and (4) sending user class information together with 
the commands, to indicate that its respective browser is an 
administration user. 

WTS gateway 142 is responsible for maintaining all
socket connections between Master Applets and WTS server
144. The connections between Master Applets and WTS
gateway 142 take place using standard sockets. In the
implementation described herein, the connection between
WTS gateway 142 and WTS server 144 takes place using
Java RMI (Remote Method Invocation). This connection
could also be accomplished utilizing sockets.

WTS server 144 is responsible for (1) managing and
tracking the activities of all browsers participating in active
sessions, exemplary activities including: loading of, interacting
with, and unloading of web pages, (2) recording the
information about the activities, (3) managing the synchronization
of the activities for all browsers participating in the
active sessions, (4) creating a session when a consumer user
(via a browser) sends a request to web site 134 for the first
time, (5) defining the session length intervals, (6) purging
sessions that have been inactive for more than the specified
session length intervals, (7) adding and deleting participants
to a session, and (8) providing services to all commands
from consumer Applets, such as: (consumer) Master Applet,
DTS Applets, SessionID Applets, and administration
Applets, such as: (administration) Master Applet, Agent
Applets, and Supervisor Applet.

Consumer page repository 146 stores the web pages and
Applets for consumers. Consumer Applets can be selectively
embedded into consumer web pages. Exemplary consumer
Applets include: (consumer) Master Applet, DTS Applets,
SessionID Applet, etc.

Administration page repository 149 stores the web pages 
and Applets for call center administration users, including: 
administrator, supervisor, agent, etc. Administration Applets 
can be selectively embedded into administration web pages. 
Exemplary administration Applets include: (administration) 
Master Applet, Agent Applet, Supervisor Applet, etc. 

To better describe the present invention, the Applets
stored in (or downloaded from) repository 146 can be
referred to as consumer Applets, and the Applets stored in
(or downloaded from) repository 149 can be referred to as
administration Applets. HTTP server 152 contains a security
application that allows consumer users to get access only to
the web pages stored in consumer page repository 146, and
allows administration users (such as administrator,
supervisor, agent, etc.) to get access to the web pages stored
in both consumer page repository 146 and administration
page repository 149. Note that because an agent can access
consumer applets, there is no need to have separate master
applets stored in administration and consumer repositories.
A single master applet could be utilized to perform both
purposes.
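
The access rule enforced by the security application can be sketched as a small lookup. This is only an illustration under stated assumptions: the repository labels and the function names below are hypothetical and do not appear in the patent, and the treatment of an unknown user class is a guess.

```python
# Hypothetical labels for the two repositories (not names from the source).
CONSUMER_REPO = "consumer_page_repository_146"
ADMIN_REPO = "administration_page_repository_149"

# User classes that the text treats as administration users.
ADMIN_CLASSES = {"administrator", "supervisor", "agent"}


def allowed_repositories(user_class):
    """Return the set of repositories a user class may read pages from."""
    if user_class in ADMIN_CLASSES:
        # Administration users may read both repositories.
        return {CONSUMER_REPO, ADMIN_REPO}
    if user_class == "consumer":
        # Consumers may read only the consumer repository.
        return {CONSUMER_REPO}
    # Assumption: an unknown class is granted nothing.
    return set()


def can_access(user_class, repository):
    """True if the given user class may read from the given repository."""
    return repository in allowed_repositories(user_class)
```

Because an agent can read the consumer repository as well, a single Master Applet stored there would be reachable by both kinds of user, which is the point made at the end of the paragraph above.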

Session table 145 is responsible for maintaining the
information for all active sessions.

Class table 147 is responsible for keeping records of user
classes or types assigned to different users. Listed are
exemplary user types: administrator, supervisor, agent, and
consumer.

Based on user classes (administrator, supervisor, agent,
and consumer), WTS server 144 provides the following
services:

(1) creating a session (consumer);

(2) storing data received from a session participant
(supervisor, agent, and consumer);

(3) listing active sessions (administrator and supervisor);

(4) listing the information associated with active sessions
(administrator and supervisor);

(5) listing current users (administrator);

(6) joining a session (supervisor and agent);

(7) terminating a session (supervisor);

(8) monitoring a session (supervisor and agent);

(9) configuring session parameters (administrator); and

(10) sending commands and information to a consumer
Master Applet or an administration Master Applet in a
participating browser (supervisor, agent, and consumer).
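
The ten services and the user classes entitled to each can be written down as a table. The sketch below is illustrative only: the service keys and the `is_permitted` helper are hypothetical names chosen here, while the class assignments are copied from the list above.

```python
# Service name (hypothetical key) -> user classes allowed to request it,
# following items (1) through (10) of the list above.
SERVICE_CLASSES = {
    "create_session": {"consumer"},
    "store_data": {"supervisor", "agent", "consumer"},
    "list_active_sessions": {"administrator", "supervisor"},
    "list_session_info": {"administrator", "supervisor"},
    "list_current_users": {"administrator"},
    "join_session": {"supervisor", "agent"},
    "terminate_session": {"supervisor"},
    "monitor_session": {"supervisor", "agent"},
    "configure_session_parameters": {"administrator"},
    "send_commands": {"supervisor", "agent", "consumer"},
}


def is_permitted(service, user_class):
    """True if the user class may request the service; unknown services are denied."""
    return user_class in SERVICE_CLASSES.get(service, set())
```

Since each Master Applet sends its user class together with every command, a server built along these lines could check `is_permitted` before acting on a command.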

Database 156 is responsible for storing data collected in
session table 145.

HTTP server 152 is responsible for processing the 
requests issued by one of the web browsers, retrieving the 
web pages from either consumer page repository 146 or 
administration page repository 149, and sending the web 
pages to the browsers that have generated these requests. 

Database processing application 148 is responsible for 
writing the data collected in session table 145 into database 
156. 

Referring to FIG. 3, there is shown the process of the
(consumer) Master Applet, DTS Applets, and SessionID
Applet being downloaded into terminal 104A from HTTP
server 152 in response to loading an initial web page, and
then being invoked to perform the operations in accordance
with the present invention.

As shown in FIG. 3, a (consumer) Master Applet, a set of
DTS Applets, and a SessionID Applet are embedded into
web page 204 by using a set of applet tags 208. Web page
204 is associated with a specific URL indicating the location
of web page 204 in HTTP server 152.

As indicated by dotted line (1), web browser 114A sends
a request including the URL of web page 204 to HTTP
server 152 via network 129 (shown in FIG. 2). As indicated
by dotted line (2), in response to the request, HTTP server
152 retrieves web page 204 from consumer page repository
146 and sends it to web browser 114A via network 129. Web
page 204 contains a set of applet tags 208, which indicate the
location of Master Applet, DTS Applets, and SessionID
Applet in HTTP server 152. As indicated by dotted line (3),
web browser 114A loads web page 204. As indicated by
dotted line (4), since Master Applet, DTS Applets, and
SessionID Applet have not been downloaded, web browser
114A sends requests via network 129, to download these
Applets based on applet tags 208. As indicated by dotted line
(5), HTTP server 152 sends Master Applet, DTS Applets,
and SessionID Applet to browser 114A via network 129. As
indicated by dotted line (6), browser 114A stores Master
Applet 124A, DTS Applets 126A, and SessionID Applet
128A into memory area 115A of terminal 104A, and initializes
and invokes these Applets. After being invoked, these
Applets are running together with web browser 114A, to
monitor and process the activities for which they are
assigned to be responsible. As indicated by line (7), Master
Applet 124A opens a dedicated socket and establishes a
socket connection to WTS gateway 142 for browser 114A
and web page 204. Via the socket connection, Master Applet
124A sends WTS server 144 a command, together with an ID
unique to browser 114A. In response to the command from
Master Applet 124A, WTS server 144 creates a session for
browser 114A based on the unique ID, issues a time
stamp (loading time) indicating the time at which the
command was received, and stores the URL and time stamp
of web page 204 into the session created for browser 114A. As
will be seen in the description in connection with FIG. 6
following, the URL, command, and loading time are stored
in a URL history list and a command list created for the
session.
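
The server-side half of step (7) can be sketched as follows. This is a minimal illustration, not the patented implementation: the class and method names, the dictionary keys, and the integer session numbering are all assumptions. The sketch shows only that a session is created on the first command from a browser ID, that the time stamp is issued by the server on receipt, and that the URL, command, and loading time go into a URL history list and a command list.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Session:
    session_id: int
    url_history: list = field(default_factory=list)   # one dict per page visit
    command_list: list = field(default_factory=list)  # one dict per command


class WtsServerSketch:
    """Hypothetical stand-in for the load-command handling of WTS server 144."""

    def __init__(self, clock=time.time):
        self._clock = clock      # injectable so the time stamp can be tested
        self._sessions = {}      # browser ID -> Session
        self._next_id = 1

    def on_page_loaded(self, browser_id, url):
        stamp = self._clock()    # time at which the command was received
        session = self._sessions.get(browser_id)
        if session is None:
            # First command from this browser: create its session.
            session = Session(self._next_id)
            self._next_id += 1
            self._sessions[browser_id] = session
        session.url_history.append(
            {"PageURL": url, "LoadingTime": stamp, "UnloadingTime": None}
        )
        session.command_list.append(
            {"Command": "load", "PageURL": url, "TimeStamp": stamp}
        )
        return session
```

A second load command from the same browser ID reuses the existing session and appends to both lists, so the order of entries reflects the order in which pages were visited.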

Referring to FIG. 4, there is shown the process of the
(consumer) Master Applet 124A, DTS Applets 126A, and
SessionID Applet 128A being invoked, in response to loading
a subsequent web page 214 (subsequent to web page
204), to perform the operations in accordance with the
present invention, when these Applets have been previously
downloaded and cached in terminal 104A.

As indicated by dotted line (1), to download web page
214, web browser 114A sends a request including the URL
of web page 214 to HTTP server 152 via network 129.
Before loading web page 214, the following events occur:
(a) browser 114A instructs Master Applet 124A to run a stop
routine, (b) via the socket connection established for
browser 114A and web page 204, Master Applet 124A sends
a command to inform WTS server 144 that web page 204 has
been unloaded, and disconnects the socket connection established
for browser 114A and web page 204, (c) WTS server
144 issues a time stamp (unloading time) indicating the time
the command was received, and (d) WTS server 144 records
the URL and the time stamp of web page 204 into the session
created for browser 114A. As shown in FIG. 6 and described
in greater detail below, the URL, command, and unloading
time are stored in a URL history list and a command list
created for the session. As indicated by dotted line (2), HTTP
server 152 retrieves web page 214 from consumer page
repository 146 and sends it to browser 114A. Like web page
204, web page 214 contains a set of applet tags 208 for
indicating the location of Master Applet 124A, DTS Applets
126A, and SessionID Applet 128A. As indicated by dotted
line (3), web browser 114A loads web page 214. As indicated
by dotted line (4), in response to the loading of web page
214, web browser 114A locates Master Applet 124A, DTS
Applets 126A, and SessionID Applet 128A (based on the
indication of applet tags 208) that are cached by browser
114A in memory area 115A, and loads these Applets and then
invokes them. As indicated by line (5), Master Applet 124A
opens a dedicated socket and establishes a socket connection
to WTS gateway 142 for browser 114A and web page 214.
Via the socket connection established for browser 114A and
web page 214, Master Applet 124A sends a command,
together with the ID unique to browser 114A and the URL
of web page 214, to inform WTS server 144 that web page
214 has been loaded. WTS server 144 issues a time stamp
(loading time) indicating the time the command was
received and stores the URL and time stamp of web page
214 into the session created for browser 114A. As shown in
FIG. 6 and described in greater detail below, the URL,
command, and loading time are stored in a URL history list
and a command list created for the session.
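
Seen from the browser side, the sequence above is a start routine and a stop routine around each page. The sketch below illustrates that ordering under stated assumptions: `RecordingTransport` is a hypothetical stand-in for the socket connection to WTS gateway 142 that simply records what is sent, and the message fields are invented for the example.

```python
class RecordingTransport:
    """Stand-in for the socket to the WTS gateway; it only records messages."""

    def __init__(self):
        self.sent = []
        self.connected = False

    def connect(self):
        self.connected = True

    def send(self, message):
        if not self.connected:
            raise RuntimeError("not connected")
        self.sent.append(message)

    def close(self):
        self.connected = False


class MasterAppletSketch:
    """Hypothetical model of the Master Applet's per-page lifecycle."""

    def __init__(self, browser_id, transport):
        self.browser_id = browser_id
        self.transport = transport
        self.current_url = None

    def start(self, url):
        # Page loaded: open a connection for this page and report the load.
        self.transport.connect()
        self.current_url = url
        self.transport.send(
            {"command": "load", "browser_id": self.browser_id, "url": url}
        )

    def stop(self):
        # Page about to be left: report the unload, then disconnect.
        self.transport.send(
            {"command": "unload", "browser_id": self.browser_id,
             "url": self.current_url}
        )
        self.transport.close()
        self.current_url = None
```

Moving from web page 204 to web page 214 then corresponds to `stop()` followed by `start(...)` with the new URL, which yields the unload command for the old page before the load command for the new one.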

Referring to FIG. 5, there is shown the process of the
(consumer) Master Applet 124A, DTS Applets 126A, and
SessionID Applet 128A being invoked, in response to loading
web page 204, to perform the operations in accordance
with the present invention, when both these Applets and web
page 204 have been previously downloaded and cached by
browser 114A in terminal 104A.

As indicated by dotted line (1), web browser 114A loads
web page 204 cached in memory area 115A maintained by
browser 114A. As described earlier, web page 224 contains
a set of applet tags 208 indicating the location of Master
Applet 124A, DTS Applets 126A, and SessionID Applet
128A. Before loading web page 224, the following events
occur: (a) browser 114A instructs Master Applet 124A to run
a stop routine, (b) Master Applet 124A sends a command via
the socket connection established for browser 114A and web
page 214 to inform WTS server 144 that web page 214 has
been unloaded, and disconnects the socket connection established
for browser 114A and web page 214, (c) WTS server
144 issues a time stamp (unloading time) indicating the time
the command was received, and (d) WTS server 144 records
the URL and time stamp of web page 214 into the session
created for browser 114A. As will be seen in the description
in connection with FIG. 6, the URL, command, and unloading
time are stored in a URL history list and a command list
created for the session. As indicated by dotted line (2),
browser 114A, in response to the loading of web page 224,
locates Master Applet 124A, DTS Applets 126A, and SessionID
Applet 128A that have been cached by browser 114A in
memory area 115A in terminal 104A, and initializes and
invokes these Applets. As indicated by line (3), Master
Applet 124A opens a dedicated socket and establishes a
socket connection to WTS gateway 142 for browser 114A
and web page 224. Via the socket connection established for
browser 114A and web page 224, Master Applet 124A sends
a command, together with the ID unique to browser 114A
and the URL of web page 224, to inform WTS server 144
that web page 224 has been loaded. WTS server 144 issues
a time stamp (loading time) indicating the time the command
was received and stores the URL and time stamp into
the session created for browser 114A. As will be seen in the
description in connection with FIG. 6, the URL, command,
and loading time are stored in a URL history list and a
command list created for the session.

In the example shown in FIG. 5, it should be appreciated
that even though no request arrives at HTTP server 152
when web page 204 is loaded from cached memory in
terminal 104A, Master Applet 124A still sends browsing
activities to WTS server 144.

It should be noted that the processes shown in FIGS. 3-5
of loading and invoking Master Applet, DTS Applets, and
SessionID Applet for terminal 104A can also be used for
terminals 104B through 104N.

In FIGS. 3-5, Master Applet, DTS Applets, and SessionID
Applet are all embedded into web pages 204 and 214.
However, it should be noted that there is no requirement that
all of these Applets be embedded into a web page. Depending
on the desired functions to be performed, respective
Applets can be selectively embedded into a web page by
selectively setting applet tags in the web page. For example,
if data synchronization and tracking of individual elements
are not needed, the applet tags for linking DTS Applets can
be eliminated from the web page. By the same token, if
additional functions are needed, additional applet tags can
be added into the web page to link additional Applets.
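
Selective embedding can be illustrated by generating the applet tags for a page from the list of Applets that page needs. This is a sketch only: the class file names, the codebase path, and the 1 by 1 pixel size are hypothetical, since the source does not give them.

```python
from html import escape

# Hypothetical class file names; the source does not name the Applet classes.
APPLET_CLASSES = {
    "master": "MasterApplet.class",
    "dts": "DtsApplet.class",
    "session_id": "SessionIdApplet.class",
}


def applet_tags(selected, codebase="/applets/"):
    """Return one <applet> tag per selected Applet, in the order given."""
    tags = []
    for name in selected:
        code = APPLET_CLASSES[name]  # raises KeyError for an unknown name
        tags.append(
            '<applet code="{}" codebase="{}" width="1" height="1"></applet>'.format(
                escape(code, quote=True), escape(codebase, quote=True)
            )
        )
    return "\n".join(tags)


# A page that needs page tracking but no data synchronization omits the DTS tag.
tracking_only = applet_tags(["master", "session_id"])
```

Adding a further function to a page would, in this sketch, amount to registering another entry in `APPLET_CLASSES` and naming it in `selected`.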

Referring to FIG. 6, there is shown session table 145 (see
FIG. 1) in greater detail, in accordance with the present
invention.

While browsers at their respective terminals are browsing
through the web pages in web site 134, WTS server 144
collects and analyzes the information about the interactions
between all browsers and the web pages that have been
downloaded to the browsers from web site 134. One difficulty
in collecting and analyzing such information is that
browsing individual web pages in web site 134 is a stateless
process. More specifically, web site 134 receives a sequence
of requests from different browsers, and sends the respective
web pages to the respective browsers in response to the
sequence of requests. Since in processing the requests from an
individual browser, web site 134 does not maintain a constant
connection to the same browser to keep a one-to-one
relationship, web site 134 has no control over, and does not
maintain data on, the sequences of the requests from the
browsers.

To meaningfully collect and analyze the information
about the interactions between the browsers and web pages,
a session is defined as a collection of web page interactions
that occur over a given period of time from a specific
browser. A session is created when a browser first hits web
site 134, and a session window (or session length interval)
is defined for the session. If activities from a specific
browser (identified by an ID unique to the browser, issued by
a respective Master Applet) do not occur within the
session window, the session is terminated and cleaned up by
WTS server 144. A session window is refreshed (reset to
time zero) each time the information about the associated
browser is sent to WTS server 144. For example, if a session
window is defined as 15 minutes, as long as the associated
terminal has some activity every 15 minutes, the session will
remain open. After 15 minutes of inactivity, the session is
terminated. A subsequent request from the same terminal
will cause a new session to be created. After a session has
been created for a terminal, one or more other terminals can
join the session.
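
The session window behaves like an inactivity timeout that is reset by every reported activity. The sketch below is one possible reading, with hypothetical names; it treats a session as expired once it has been inactive for more than the window length, and it takes the current time as an argument so that the behavior is easy to check.

```python
class SessionWindow:
    """Hypothetical inactivity tracker for sessions, keyed by session ID."""

    def __init__(self, length_seconds):
        self.length = length_seconds
        self.last_activity = {}  # session ID -> time of last reported activity

    def touch(self, session_id, now):
        """Refresh the window: called whenever activity for the session arrives."""
        self.last_activity[session_id] = now

    def purge_expired(self, now):
        """Remove and return the sessions inactive for more than the window."""
        expired = [
            sid for sid, last in self.last_activity.items()
            if now - last > self.length
        ]
        for sid in expired:
            del self.last_activity[sid]
        return expired
```

With a 15 minute window (`SessionWindow(15 * 60)`), a session touched at time 0 and again at 600 seconds survives a purge at 1200 seconds, because only 600 seconds have passed since its last activity.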

As shown in FIG. 6, session table 145 includes M Session 
IDs created for M sessions respectively. Each of the session
IDs is associated with: (1) a session list for maintaining
information about a session, (2) a participant list for main- 
taining information about all participant browsers in a 
session (note: when a session is first created, it only contains 
one participant), (3) a URL history list for maintaining 
information about all web pages visited by all participants in 
a session, (4) a data list for maintaining information about 
the data fields on the web pages visited by all participants in 
a session, and (5) a command list for maintaining informa- 
tion about all commands issued to WTS server 144 by the 
various participants in a session. 

Typical items in a session list are: (1) SessionID for 
identifying a session, (2) UserName for indicating the actual 
name for whom the session is created, (3) StartTime for 
indicating the time of starting the session, (4) StopTime for 
indicating the time of stopping the session, and (5) SessionNotes
for recording the notes of the session.

Typical fields contained in a participant list are: (1) 
SessionID for linking the participant list to a session, (2) 
ParticipantID for identifying a participant, (3) ParticipantAddresses
for indicating a participant's IP address, (4)
Class for indicating the user class of the participant
(customer, agent, supervisor, administrator, etc.), and (5)
Direction for indicating the synchronization direction for the 
participant browser. 

Typical fields contained in a URL history list are: (1)
SessionID for linking the URL history list to a session, (2)
ParticipantID for identifying a participant who visited the
web page, (3) LoadingTime for indicating the loading time
of the web page, (4) UnloadingTime for indicating the
unloading time of the web page, and (5) PageURL for
indicating the URL of a web page visited.

Typical fields contained in a data list are: (1) SessionID
for linking the data list to a session, (2) ParticipantID for
indicating the participant browser who updated this data
field, (3) FieldName for indicating the actual name of the
data field, (4) DataName for indicating the name of the data
field displayed on a web page, (5) DataValue for indicating
the value of the data field, (6) PageURL for indicating the
web page on which the data field was displayed, (7) TimeStamp
for indicating the time at which this data field is
updated, and (8) WasRelayed for indicating if this data field
has been broadcast.

Typical fields contained in a command list are: (1) SessionID
for linking the command list to a session, (2) ParticipantID
for indicating the participant browser issuing the command,
(3) Command for indicating the specific command executed
(loading a page, unloading a page, changing a data field,
etc.), (4) PageURL for indicating the web page on which the
command operated, (5) FieldPoint for indicating the data
field on which the command operated, and (6) TimeStamp
for indicating the time at which the command was executed.
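
The five lists attached to each session ID map naturally onto record types. The sketch below is an illustration under stated assumptions: the field names are snake_case renderings of the names listed above, the Python types are guesses, and nesting the four per-session lists inside one `SessionRecord` (instead of repeating SessionID in every row) is a design choice made here, not something the source prescribes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Participant:
    participant_id: str
    participant_address: str      # the participant's IP address
    user_class: str               # customer, agent, supervisor, ...
    direction: str                # synchronization direction


@dataclass
class UrlHistoryEntry:
    participant_id: str
    page_url: str
    loading_time: float
    unloading_time: Optional[float] = None  # unknown until the page is left


@dataclass
class DataEntry:
    participant_id: str
    field_name: str
    data_name: str
    data_value: str
    page_url: str
    time_stamp: float
    was_relayed: bool = False


@dataclass
class CommandEntry:
    participant_id: str
    command: str                  # e.g. "load", "unload", "change_field"
    page_url: str
    time_stamp: float
    field_point: Optional[str] = None  # set only for data field commands


@dataclass
class SessionRecord:
    session_id: str
    user_name: str
    start_time: float
    stop_time: Optional[float] = None
    session_notes: str = ""
    participants: List[Participant] = field(default_factory=list)
    url_history: List[UrlHistoryEntry] = field(default_factory=list)
    data_list: List[DataEntry] = field(default_factory=list)
    command_list: List[CommandEntry] = field(default_factory=list)
```

A session table along these lines would be a dictionary from session ID to `SessionRecord`, holding one record for each of the M active sessions.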

Before a session is purged from session table 145, database
processing application 148 stores the associated session
list, URL history list, and command list to database 156. The
data contained in these three lists can be used by data
warehouse integration applications.

Referring to FIG. 7, there is shown an operation for
creating a session interface for an agent (or a supervisor) by
downloading an agent page (or a supervisor page) from
administration page repository 149, in accordance with the
present invention. In the example shown in FIG. 7, it is
assumed that administration user class (either agent user
class or supervisor user class) is assigned to terminal 104N,
so that the security application in HTTP server 152 grants
access to the web pages stored in both consumer page
repository 146 and administration page repository 149.

At step 702, an agent at terminal 104N types in an agent
URL at terminal 104N, and browser 114N sends the URL to
HTTP server 152, to retrieve an agent page, in which an
(administration) Master Applet and an Agent Applet are
embedded. For a supervisor, he/she types in a supervisor
URL at terminal 104N, and browser 114N sends the URL to
HTTP server 152, to retrieve a supervisor page, in which an
(administration) Master Applet and a Supervisor Applet are
embedded.

At step 704, HTTP server 152 retrieves the agent page (or 
a supervisor page) from administration page repository 149 
and sends it to browser 114N. 

At step 706, browser 114N downloads the agent page, in 
which a Master Applet (administration Master Applet) and 
an Agent Applet are embedded; or downloads the supervisor 
page, in which a Master Applet (administration Master 
Applet) and a Supervisor Applet are embedded. 

At step 708, browser 114N downloads the Master Applet 
and Agent Applet from HTTP server 152, initializes and 
invokes these Applets; or downloads the Master Applet and 
Supervisor Applet from HTTP server 152, initializes and 
invokes these Applets. 

At step 710, Master Applet opens a dedicated socket,
establishes a socket connection to WTS gateway 142, and
sends a unique ID to WTS server 144, so that WTS server 144
is able to identify browser 114N based on the unique ID.

At step 712, Agent Applet creates an agent session interface
800A shown in FIG. 8A for the agent user, or Supervisor
Applet creates a supervisor session interface 800B
shown in FIG. 8B for the supervisor agent.

Referring to FIG. 8 A, there is shown an agent session 
interface 800A created for an agent at step 712, in accor- 
dance with the present invention. 

As shown in FIG. 8A, the session interface contains a text 
box 804 for entering a session ID, a Join session button 806 
for joining a session identified by the session ID, a drop 
button 808 for leaving a session, a leader check box 810 
(selecting of which designates a browser as a leading 
browser in synchronization), a follower check box 812 
(selecting of which designates a browser as a following 
browser in synchronization), a scrollable list box 816 for 
displaying the information contained in the participant list 
associated with a selected session, a scrollable list box 818 
for displaying the information in an identified URL history 
list, and a Display URL button 820 for displaying URLs
selected in URL history list 818. If both the leader and
follower check boxes 810 and 812 are selected in the agent
session interface, browser 114A acts as both leading and
following browser in synchronization.

Referring to FIG. 8B, there is shown a browser supervisor 
session interface 800B created for a supervisor at step 712, 
in accordance with the current invention.

As shown in FIG. 8B, the session interface contains a
scrollable list box 832 for displaying session IDs of all active
sessions in session table 145 and for selecting one of the
session IDs, a text box 834 for displaying relevant statistics
of WTS server 144, a multi-column scrollable list box 836
for displaying details about the session selected in scrollable
list box 832, and a select session button 838 for selecting a
session from scrollable list box 832. By using the information
in scrollable list box 832, a supervisor can monitor all active
sessions. By using the information in multi-column scrollable
list box 836, a supervisor can monitor the operational status
of a session selected from scrollable list box 832, including:
(1) whether this session is being helped by an agent, (2) user
name, and (3) agent ID.

By selecting select session button 838, a supervisor can
access an agent session interface as shown in FIG. 8C. 

Referring to FIG. 8C, there is shown a supervisor agent 
session interface 800C, in accordance with the present 
invention. 

Referring to FIG. 9, there is shown a flowchart illustrating
the operation of joining a session by an agent, in accordance 
with the present invention. 

In the example shown in FIG. 9, it is assumed that: (1) a
consumer at terminal 104A is browsing web pages from
consumer page repository 146 via browser 114A, (2) session
list 1 shown in FIG. 6 has been created for browser 114A,
(3) an agent class has been assigned to browser 114N, (4)
agent session interface 800A shown in FIG. 8A has been
displayed on terminal 104N, (5) an (administration) Master
Applet and Agent Applet have been previously downloaded
into browser 114N, (6) a dedicated socket connection has
been established for session interface 800A displayed at
terminal 104N by the (administration) Master Applet, and
(7) the agent at terminal 104N is on duty at a call center.

As shown in FIG. 9, at step 902, the consumer is browsing
a web page at terminal 104A. On the web page, SessionID
Applet 128A displays the current session ID. A call center
telephone number the consumer can call is also displayed on
the web page.

At step 904, the consumer is connected to the call center
by dialing the telephone number via telephone 102A (see
FIG. 1), and the call is directed by the call center to the
agent.

At step 906, the consumer tells the agent, via telephone
102A (see FIG. 1), the current session ID displayed. It should
be noted that, instead of using the telephone, the agent can
be informed of the current session ID by alternative methods.
For example, the consumer can enter his/her telephone
number into a special web page that contains a customer ID
identifying the consumer along with the current session ID.
This information can be stored into a special lookup table
that can be used by the agent to identify the customer.

At step 908, at terminal 104N, the agent types the current
session ID into text box 804 (see FIG. 8A).

At step 910, in response to a loss of focus or a pressing
of the Enter key, the (administration) Master Applet at
terminal 104N sends a command to WTS server 144, to
retrieve the information in participant list 1, URL history list
1, and data list 1 (see FIG. 6) for the Agent Applet.

At step 912, WTS server 144 sends the information
requested to the Agent Applet (via the Master Applet).

At step 914, the Agent Applet at terminal 104N displays
some information from participant list 1 and URL history list
1 in (participant) scrollable list box 816 and (URL history)
scrollable list box 818, respectively.

At step 916, the agent selects join button 806 in agent 
session interface 800A displayed on terminal 104N.

At step 918, in response to the selection at step 916,
through the socket connection which has been established
for the agent session interface displayed on terminal 104N,
the (administration) Master Applet sends WTS server 144 a
command to join the selected session. Based on the
identification associated with the socket connection, WTS
server 144 is able to generate a ParticipantID for browser
114N and to find the ParticipantAddress for terminal 104N.

At step 920, WTS server 144 stores the ParticipantID and
ParticipantAddress into participant list 1. At this step,
participant list 1 includes two participant records (two rows)
containing the ParticipantIDs for browsers 114A and 114N
respectively.

At step 922, at terminal 104N, the agent selects leader
check box 810, follower check box 812, or both of them.
By selecting only leader check box 810, the activities at
terminal 104N are synchronized at terminal 104A, but not
the other way around. By selecting only follower check box
812, the activities at terminal 104A are synchronized at
terminal 104N, but not the other way around. By selecting
both leader and follower check boxes 810 and 812, the
activities at terminals 104A and 104N are synchronized with
each other (bi-directional synchronization). In response to the
selection(s), through the socket connection which has been
established for agent session interface 800A, the
(administration) Master Applet sends WTS server 144 a
command designating the synchronization direction. WTS
server 144 stores the synchronization direction information
into the Direction fields of the two records in participant
list 1. In this example, it is assumed that bi-directional
synchronization has been selected for terminals 104A and
104N.

At step 924, WTS server 144 sends the (administration) 
Master Applet the URL of the web page being currently 
browsed at terminal 104A. 

At step 926, the Agent Applet at terminal 104N opens a 
browser window 1004 (a second browser instance) as shown 
in FIG. 10. 

At step 928, browser 114N downloads the web page 
identified by the URL from consumer page repository 146, 
and displays it in browser window 1004. A (consumer) 
Master Applet, a set of DTS Applets, and a SessionID Applet
are embedded in the web page downloaded. 

At step 930, browser 114N downloads (consumer) Master
Applet 124N, set of DTS Applets 126N, and SessionID 
Applet 128N. 

At step 932, the web pages displayed in second browser 
window 1004 at terminal 104N are being synchronized with 
the web pages being displayed at terminal 104A. 

After step 932, if the agent (the first agent) at terminal
104N needs assistance from another agent (the second agent)
at terminal 104K, the first agent can call the second agent
and tell him/her the current session ID. The second agent can
then join the current session using an agent session interface
as shown in FIG. 8A displayed at terminal 104K.
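
The joining operation and the leader/follower selection of step 922 can be sketched as follows. This is a minimal illustration under stated assumptions: the `Participant` and `Session` classes, the boolean `leader`/`follower` flags (standing in for the Direction field), and the convention that the consumer's browser is both leader and follower by default are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    participant_id: int
    address: str
    leader: bool = False    # assumed flag: this browser's activity is pushed to others
    follower: bool = False  # assumed flag: this browser mirrors others' activity


@dataclass
class Session:
    session_id: str
    participants: list = field(default_factory=list)

    def join(self, address: str) -> Participant:
        """Add a record to the participant list, as in steps 918 and 920."""
        p = Participant(len(self.participants) + 1, address)
        self.participants.append(p)
        return p


def targets(session: Session, source: Participant) -> list:
    """Participants that should mirror an activity originating at source."""
    if not source.leader:
        return []
    return [p for p in session.participants if p is not source and p.follower]
```

With the consumer marked as both leader and follower, selecting only the leader box for the agent yields agent-to-consumer synchronization, selecting only the follower box yields consumer-to-agent synchronization, and selecting both yields the bi-directional case.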

Referring to FIG. 10, there is shown a screen display 
containing two browser instances (800A and 1004) at ter- 
minal 104N, in accordance with the present invention. 

As shown in FIG. 10, at terminal 104N, the first browser 
instance provides an agent session interface 800A to control 
and monitor the current session, and the (administration) 
Master Applet for agent session interface 800A establishes
and maintains a socket connection for agent session inter- 
face 800A. The second browser instance provides a browser 
window 1004 to display the web pages being synchronized. 
(Consumer) Master Applet 124N establishes and maintains 
a socket connection for each web page displayed in browser 
window 1004.

Referring to FIG. 11, there is shown a flowchart illustrating
the operation of web page synchronization, in accordance
with the present invention.

In the example shown in FIG. 11, it is assumed that: (1)
a consumer at terminal 104A is browsing web pages from
consumer page repository 146 via browser 114A, (2) a
session has been created for browser 114A, (3) session list
1 and participant list 1 shown in FIG. 6 have been created
for the session, (4) bi-directional synchronization has been
selected for terminal 104A and all participant terminals, and
(5) the (consumer) Master Applet, DTS Applets, and
SessionID Applet have been downloaded into browser 114A
and all participant browsers.

As shown in FIG. 11, at step 1104, browser 114A loads a
web page either from consumer page repository 146 or from
memory area 115A in terminal 104A. If Master Applet
124A, DTS Applets 126A, and SessionID Applet 128A had
not been downloaded to browser 114A, browser 114A would
download these Applets from consumer page repository 146.
However, in this example, these Applets are assumed to be
downloaded.

At step 1106, in response to the loading of the web page,
browser 114A initializes and invokes Master Applet 124A,
DTS Applets 126A, and SessionID Applet 128A.

At step 1108, Master Applet 124A: (1) opens a dedicated
socket, and establishes a socket connection to WTS gateway
142 for browser 114A and the web page loaded, and (2) via
the socket connection, sends WTS server 144 a command
together with an ID unique to browser 114A and the URL of
the web page loaded. Based on the unique ID, WTS server
144 is able to identify the session created for browser 114A.

At step 1110, WTS server 144 identifies the session for
browser 114A.

At step 1112, WTS server 144 locates all IP addresses
assigned to participant terminals in participant list 1 (shown
in FIG. 6), and sends a command, together with the URL, to
all the participant terminals (except that WTS server 144
does not send the URL to terminal 104A, because the URL
originated from terminal 104A).

At step 1114, upon receiving the command, the
(consumer) Master Applets in the participant terminals
initialize themselves and pass the URL to their respective
browsers.

At step 1116, the respective browsers in the participant
terminals download and display the web page according to
the URL.

It should be noted that, like terminal 104A, each of the
participant terminals (at which an agent session interface is
displayed) can lead the page synchronization using the 
operation shown in FIG. 11. 
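
The relay performed at steps 1110 through 1112 can be illustrated with a small in-memory sketch. The `SyncServer` class, its dictionaries, and the per-browser outbox are hypothetical stand-ins for WTS server 144, the session and participant lists, and the socket connections; no real networking is performed.

```python
class SyncServer:
    """Toy stand-in for the synchronization server (names are assumptions)."""

    def __init__(self):
        self.session_of = {}    # browser unique ID -> session ID
        self.participants = {}  # session ID -> list of browser IDs
        self.outbox = {}        # browser ID -> URLs that browser is told to load

    def join(self, session_id, browser_id):
        self.session_of[browser_id] = session_id
        self.participants.setdefault(session_id, []).append(browser_id)
        self.outbox[browser_id] = []

    def page_loaded(self, browser_id, url):
        # Step 1110: identify the session from the browser's unique ID.
        session_id = self.session_of[browser_id]
        # Step 1112: send the URL to every participant except the originator.
        for other in self.participants[session_id]:
            if other != browser_id:
                self.outbox[other].append(url)
```

How the described system avoids echoing a relayed page load back to its originator is not covered in this passage, so the sketch does not attempt to model it.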

Referring to FIG. 12A, there is shown a web page 1200
containing five data fields, specifically: name 1202, time
period 1204, account balance 1206, payment 1208, and
comments 1210, together with a text box 1212 for displaying
the current session ID, and a text box for displaying the call
center number the consumer can call, in accordance with the
present invention.

Referring to FIG. 12B, there is shown a web page that is
similar to that of FIG. 12A, except that the data in the name
field 1202 has been changed from Susan King to Sue Grant,
and a comment indicating that the name contained in field
1202 has been changed is provided in comments field 1210.
These changes are also displayed in a web page 1200'
synchronized at a participant terminal, in accordance with
the present invention.
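
The field-level propagation illustrated by FIGS. 12A and 12B can be sketched as below. The dictionary-of-fields page model and the `sync_field` function are assumptions made for illustration; in the described system the change travels through the Master Applets and the synchronization server rather than through shared memory.

```python
def sync_field(pages: dict, origin: str, field_name: str, value: str) -> None:
    """Apply a field change at the originating terminal and mirror it elsewhere.

    pages maps a terminal label to a dict of field values (a hypothetical model
    of web page 1200 and its synchronized copies).
    """
    pages[origin][field_name] = value
    for terminal, page in pages.items():
        if terminal != origin:
            page[field_name] = value


# Usage: the name change of FIG. 12B, then a comment entered by the agent.
pages = {
    "104A": {"name": "Susan King", "comments": ""},
    "104N": {"name": "Susan King", "comments": ""},
}
sync_field(pages, "104A", "name", "Sue Grant")
sync_field(pages, "104N", "comments", "Account's name had been changed")
```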

Referring to FIG. 13, there is shown a flowchart illustrating
the operation of data synchronization, in accordance
with the present invention.

In the example shown in FIG. 13, it is assumed that: (1)
a customer at terminal 104A is browsing web pages via
browser 114A, (2) a session has been created for terminal
104A, (3) session list 1 and participant list 1 shown in FIG.
6 have been created for the session, (4) terminal 104N is one
of the participants, (5) web page 1200 containing five data
fields shown in FIG. 12A is displayed on terminal 104A and
all participant terminals, (6) bi-directional synchronization
has been selected for terminal 104A and all participant
terminals, (7) the (consumer) Master Applet, DTS Applets,
and SessionID Applet have been downloaded to browser
114A and all participant browsers, (8) the DTS Applets
contain five individual Applets: DTS Applet 1, DTS
Applet 2, DTS Applet 3, DTS Applet 4, and DTS Applet 5, (9)
these five individual DTS Applets are respectively responsible
for monitoring and processing the events occurring on
the five data fields of web page 1200 shown in FIG. 12A,
(10) (consumer) Master Applet 124A has established a
dedicated socket connection to WTS gateway 142 for web
page 1200 displayed at terminal 104A, and (11) the customer
at terminal 104A wants to change name field 1202
from Susan King to Sue Grant.

As shown in FIG. 13, at step 1304, the customer changes 
the name in name field 1202 from Susan King to Sue Grant. 

At step 1306, in response to a loss of focus on name field 
1202 or pressing the Enter key, DTS Applet 1 detects the
change and passes the change to Master Applet 124A. Note 
that the change of the form field value could also be detected 
through a proprietary interface between the master applet 
and the web browser. 



At step 1308, via the dedicated socket connection, Master
Applet 124A sends WTS server 144 a command specifying
the change of name field 1202. Since this change is passed
to WTS server 144 via the dedicated socket connection
established for web page 1200, WTS server 144 is able to
recognize the origin of the command, web page 1200, and
the name field upon which the change was made.

At step 1310, WTS server 144 identifies the session
created for browser 114A.

At step 1312, WTS server 144 locates the IP addresses
assigned to participant browsers in participant list 1 and
sends a command (together with the change of name field
1202) to the Master Applets in all other participant terminals
(except that WTS server 144 does not send the command
and change to browser 114A, since this change originated
from browser 114A).

At step 1314, upon receiving the command, the
(consumer) Master Applets (including Master Applet
124N) pass the change of name field 1202 to their respective
DTS Applets, including DTS Applet 1 at browser 114N.

At step 1316, the DTS Applet 1 instances display the update
"Sue Grant" in the name fields on the respective web pages
1200 displayed on the respective terminals, including
terminal 104N.

It should be noted that the operation shown in FIG. 13 can
be used to perform data synchronization for the other four
data fields on web page 1200 shown in FIG. 12A.

It should also be noted that the data field synchronization
can also be performed at terminal 104N. For example, as
shown in FIG. 12B, when the agent at terminal 104N enters
comments of "Account's name had been changed" to
comments field 1210' on web page 1200', this update will be
displayed in comments field 1210 at terminal 104A, by using
the operation shown in FIG. 13.

Referring to FIG. 14, there is shown a flowchart illustrating
the operation of web page tracking, in accordance with
the present invention.

In the example shown in FIG. 14, it is assumed that: (1)
a customer at terminal 104A is browsing web pages via
browser 114A, (2) a session has been created for terminal
104A and all participant terminals, (3) session list 1,
participant list 1, and URL history list 1 shown in FIG. 6 have
been created for the session, (4) bi-directional synchronization
has been selected for terminal 104A and all participant
terminals, and (5) the (consumer) Master Applet, DTS
Applets, and SessionID Applet have been downloaded into
terminal 104A and all participant terminals.

As shown in FIG. 14, at step 1404, browser 114A downloads
a web page from either the consumer page repository
146 or memory area 115A of terminal 104A. If Master
Applet 124A, DTS Applets 126A, and SessionID Applet
128A had not been downloaded to terminal 104A, browser
114A would download them from HTTP server 152.
However, in this example, these Applets have been
downloaded.

At step 1406, web browser 114A initializes and invokes
Master Applet 124A, DTS Applets 126A, and SessionID
Applet 128A.

At step 1408, Master Applet 124A opens a dedicated
socket and establishes a socket connection to WTS gateway
142 for web browser 114A and the web page loaded. Master
Applet 124A then sends WTS server 144 a command,
together with: (1) an ID unique to browser 114A, and (2) the
URL of the web page loaded. When commands and URL are
delivered through this socket connection, WTS server 144 is
able to recognize the origin of the commands and URL.

At step 1410, WTS server 144 identifies the session ID for
browser 114A.

At step 1412, WTS server 144 locates the session list 1
and URL history list 1.

At step 1414, WTS server 144 issues a time stamp
(loading time) for indicating the time at which the command
was received, and stores the URL and time stamp to URL
history list 1.

At step 1416, browser 114A sends WTS server 144 a 
request to load a subsequent web page. 

At step 1418, before loading the subsequent web page, via 
the socket connection, Master Applet 124A sends WTS
server 144 a command, together with the URL, to inform 
WTS server 144 that the current web page has been 
unloaded. 

At step 1420, WTS server 144 identifies the session for 
terminal 104A. 

At step 1422, WTS server 144 locates the session list 1 
and URL history list 1. 

At step 1424, WTS server 144 issues a time stamp 
(unloading time) for indicating the time at which the com- 
mand was received, and stores the URL and time stamp to 
URL history list 1. 

At step 1426, Master Applet 124A disconnects the socket 
connection for the web page that has been unloaded. 
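
The load and unload bookkeeping of FIG. 14 can be sketched as a small record-keeping class. The `UrlHistory` name, the record layout, and the injectable clock are illustrative assumptions; the specification only states that the URL is stored with a loading time stamp and, later, an unloading time stamp issued when each command is received.

```python
import time


class UrlHistory:
    """Toy model of a URL history list (structure is an assumption)."""

    def __init__(self, clock=time.time):
        self.clock = clock  # injectable so the sketch can be exercised deterministically
        self.records = []   # one dict per page visit

    def loaded(self, url):
        # Compare step 1414: stamp the time the load command was received.
        self.records.append({"url": url, "loaded": self.clock(), "unloaded": None})

    def unloaded(self, url):
        # Compare step 1424: stamp the most recent open visit of this URL.
        for rec in reversed(self.records):
            if rec["url"] == url and rec["unloaded"] is None:
                rec["unloaded"] = self.clock()
                return
        raise KeyError(url)
```

Keeping both time stamps on one record is a design choice of this sketch; storing separate load and unload entries would serve the same purpose.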

Referring to FIG. 15, there is shown a flowchart illustrat- 
ing the operation of data tracking, in accordance with the 
present invention. 

In the example shown in FIG. 15, it is assumed that: (1)
a customer at terminal 104A is browsing web pages via
browser 114A, (2) a session has been created for terminal
104A, (3) session list 1 and participant list 1 shown in FIG.
6 have been created for the session, (4) terminal 104N is one
of the participants, (5) web page 1200 containing five data
fields shown in FIG. 12A is displayed on terminal 104A and
all participant terminals, (6) bi-directional synchronization
has been selected for terminal 104A and all participant
terminals, (7) the (consumer) Master Applet, DTS Applets,
and SessionID Applet are downloaded to terminal 104A and
all participant terminals, (8) the DTS Applets contain five
individual Applets: DTS Applet 1, DTS Applet 2, DTS
Applet 3, DTS Applet 4, and DTS Applet 5, (9) these five
individual DTS Applets are respectively responsible for
displaying, monitoring and processing the events occurring
on the five data fields of web page 1200 shown in FIG. 12A,
(10) Master Applet 124A has established a dedicated socket
connection to WTS server 144 for web page 1200 displayed
on terminal 104A, and (11) the customer at terminal 104A
wants to change name field 1202 from Susan King
to Sue Grant.

As shown in FIG. 15, at step 1504, the customer changes
the name in name field 1202 from Susan King to Sue Grant.

At step 1506, in response to a loss of focus on name field
1202 or pressing the Enter key, DTS Applet 1 detects the
change and passes the change to Master Applet 124A.

At step 1508, via the dedicated socket connection, Master
Applet 124A sends WTS server 144 a command together
with the change of name field 1202. Since this change is
passed to WTS server 144 via the dedicated socket connection
established for web page 1200, WTS server 144 is able
to recognize the origin of the command, web page 1200, and
the name field upon which the change was made.

At step 1510, WTS server 144 identifies the session
created for terminal 104A.

At step 1512, WTS server 144 stores the URL and update
of name field 1202 into data list 1.

It should be noted that the operation shown in FIG. 15 can
be used to perform data tracking for the other four data fields
on web page 1200, and to perform data tracking for all
participant terminals.

Referring to FIG. 16, there is shown a flowchart illustrating
the operation of repeating browsing activities previously
performed to a web page in a session, in accordance with the
present invention.

In the example shown in FIG. 16, it is assumed that: (1)
a user at terminal 104A had previously browsed web pages
from consumer page repository 146 via browser 114A, (2) a
session list 1 shown in FIG. 6 was created for browser 114A,
(3) an agent is on duty at terminal 104N in a call center, and
agent class has been assigned to browser 114N, (4) at
browser 114N, the first and second browser instances for the
agent as shown in FIG. 10 have been displayed, (5) via
respective socket connections established by Master Applet
124N, the first and second browser instances for the agent as
shown in FIG. 10 have been connected to WTS gateway
142, (6) Master Applets (124A and 124N), DTS Applets
(126A and 126N) and SessionID Applets (128A and 128N)
have been downloaded into terminals 104A and 104N
respectively, (7) the agent has selected and joined the session
created for browser 114A, (8) at browser 114N, the second
browser instance for the agent as shown in FIG. 10 is being
synchronized with browser 114A, and (9) bi-directional
synchronization has been selected for browsers 114A and 114N.

At step 1602, the agent selects a session and reviews the
URLs for all the web pages previously browsed by browser
114A in the selected session.

At step 1604, to display an individual web page previously
browsed by browser 114A, the agent selects a URL
from scrollable list box 818 (or scrollable list box 858) and
double-clicks on it.

At step 1606, agent Master Applet 124N sends WTS
server 144 a command together with the selected URL, via
its respective socket connection.

At step 1608, upon receiving the command, WTS server 144
retrieves the web page identified by the URL from HTTP
server 152 (see FIG. 2). If the user browser previously
performed any changes to the data fields of the web page,
these changes are stored in data list 1 as shown in FIG. 6.

At step 1610, WTS server 144 sends the web page to user and
agent Master Applets 124A and 124N. If the user browser
previously performed some changes to the data fields of the
web page, WTS server 144 also sends these changes to user
and agent Master Applets 124A and 124N.

At step 1614, user and agent Master Applets 124A and
124N instruct their respective browsers to display the web
page previously browsed. If the user browser previously
performed any changes to the data fields of the web page, the
user and agent Master Applets receive the final data values
of the changes and cause each data field to display these
changes on the web page.

Referring to FIG. 17, there is shown a flowchart illustrating
the operation of replaying a browsing session, in accordance
with the present invention.

In the example shown in FIG. 17, it is assumed that: (1)
a user at terminal 104A had browsed web pages from
consumer page repository 146 via browser 114A, (2) a
session list 1 shown in FIG. 6 was created for browser 114A,
(3) an agent is on duty at terminal 104N in a call center, and
agent class has been assigned to browser 114N, (4) at
browser 114N, the first and second browser instances for the
agent as shown in FIG. 10 have been displayed, (5) via
respective socket connections established by Master Applet
124N, the first and second browser instances for the agent as
shown in FIG. 10 have been connected to WTS gateway
142, (6) Master Applets (124A and 124N), DTS Applets
(126A and 126N) and SessionID Applets (128A and 128N)
have been downloaded into terminals 104A and 104N
respectively, (7) the agent has selected and joined the session
created for browser 114A, (8) at browser 114N, the second
browser instance for the agent as shown in FIG. 10 is being
synchronized with browser 114A, and (9) bi-directional
synchronization has been selected for browsers 114A and 114N.

At step 1702, the agent selects a session and reviews the
URLs for all the web pages previously browsed by browser
114A in the session.

At step 1704, to re-browse all web pages previously
browsed by browser 114A, the agent selects Session Replay
button 820 in the agent session interface as shown in FIG.
10.

At step 1706, agent Master Applet 124N sends WTS
server 144 a command together with the session selected, via
its respective socket connection.

At step 1708, upon receiving the command, WTS server 144
retrieves the web pages identified by the URLs in the
selected session from HTTP server 152 (see FIG. 2). If the
user browser previously performed any changes to the data
fields of the web pages, these changes are stored in data list
1 as shown in FIG. 6.

At step 1710, WTS server 144 sequentially sends the web
pages to user and agent Master Applets 124A and 124N, in
accordance with the loading and unloading times stored in
URL history list 1 (see FIG. 6). If the user browser previously
performed any changes to the data fields of the web
pages, WTS server 144 also sends these changes to user and
agent Master Applets 124A and 124N, in accordance with
the loading and unloading times stored in data list 1.

At step 1712, user and agent Master Applets 124A and 124N
instruct their respective browsers 114A and 114N to display
the web pages sent from WTS server 144. If the user
browser previously performed any changes to the data fields
of the web pages, user and agent Master
Applets 124A and 124N receive the final data values of the
changes and cause each data field to display these changes
on the web pages.

Since the loading time and unloading time of the URLs
and the setting time for data fields are recorded in URL
history list 1 and data list 1, if desired, all the web pages
identified by the URLs and the activities performed to the
data fields can be duplicated (loading the web page, setting
data fields on the web page, and unloading the web page)
according to the timing information. If desired, the sequence
of URLs and data entry may be played proportionally faster
or slower than the original session.

It should be noted that, in the above-described
embodiments, all the Applets (Master Applets, DTS Applets,
SessionID Applets, and Agent Applet) embedded into web
pages are written using Java. However, some or all of these
Applets can be written using a browser script language, such 
as JavaScript. More specifically, some of the features
provided by these Applets can be directly written into web 
pages using the browser script language, instead of using 
applet tags to link these Applets. When a web browser 
downloads a web page containing the Applets written in 
browser script language, it stores these Applets into the 
memory area of the terminal on which the web browser is 
running, and then initializes and invokes them. It should also 
be noted that a session could conceivably be replayed 
directly from database 156 in FIG. 1. In this scenario, the 
session has been terminated, but because all of the necessary 
data was stored in a database prior to the session being 
purged, an active session is not required to implement 
playback. 
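
Timed replay from recorded time stamps, including playback that is proportionally faster or slower than the original session, might be scheduled along the following lines. The event tuple layout, the `replay_schedule` name, and the `speed` parameter are assumptions for illustration; the specification does not prescribe a data format or scheduling routine.

```python
def replay_schedule(events, speed=1.0):
    """Turn recorded events into (delay_seconds, action, detail) tuples.

    events: iterable of (timestamp, action, detail), for example entries drawn
    from a URL history list and a data list (hypothetical layout).
    speed: values above 1.0 replay faster, values below 1.0 replay slower.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    ordered = sorted(events, key=lambda e: e[0])
    if not ordered:
        return []
    start = ordered[0][0]
    # Delays are measured from the first recorded event and scaled by the speed.
    return [((t - start) / speed, action, detail) for t, action, detail in ordered]


# Usage: a page load, a field change, and an unload replayed at double speed.
recorded = [
    (160.0, "unload", "/account"),
    (100.0, "load", "/account"),
    (130.0, "set", "name=Sue Grant"),
]
schedule = replay_schedule(recorded, speed=2.0)
```

Because the schedule depends only on stored time stamps, the same routine would apply whether the events come from an active session or from records kept in a database after the session has ended.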

It can thus be seen that there has been provided by the 
present invention a new and useful mechanism to record the 
detailed browsing activities of an individual browser, and to 
reproduce a recorded set of browser activities at one or more 
browsers. The invention described herein has utility in a 
variety of applications including: 

Consumer behavior analysis. The present invention can 
assist web developers or other interested parties in 
analyzing an individual consumer's web browsing 
activities. This function is comparable to the type of
consumer behavior analysis currently performed in 
retail stores when analyzing the specific actions of a 
consumer as he or she walks through a store making 
selections. 

Quality assurance. Currently telephone conversations 
between customers and agents in call centers are 
recorded for quality assurance purposes. Supervisors 
are thereafter able to listen to the recorded conversations
to ensure that customers are treated properly and
that agents are following the appropriate procedures. 
When web conferencing is used in concert with a call 
center, it may be appropriate to record the synchronized 
browser activity between the call center agent and the 
end consumer. This recorded web session could be 
played back in synchronization with the voice record of 
the call. 

Training. The present invention enables the presentation 
of a recorded sequence of web pages to a group of 
individuals, each utilizing a separate web browser. The 
ability to control the rate of playback would allow an 
instructor to slow down the presentation to answer 
questions, or to pause the presentation to take a break, 
provide comment or answer questions. 

While the invention has been illustrated and described in
detail in the drawing and foregoing description, it should be 
understood that the invention may be implemented through 
alternative embodiments within the spirit of the present 
invention. Thus, the scope of the invention is not intended to 
be limited to the illustration and description in this 
specification, but is to be defined by the appended claims. 
What is claimed is: 

1. A method for repeating browsing activities performed 
during a web browsing session using a first web browser on 
a first client computer, said browsing activities including a 
series of requests for retrieving web pages from a web 
server, the method comprising the steps of: 

(a) receiving at said web server an initial request from said 
first web browser for retrieving an initial page, said 
initial request indicating the beginning of said web 
browsing session; 

(b) creating a session file at said web server for said web 
browsing session in response to said initial request;

(c) recording said browsing activities performed during
said web browsing session to said session file; 

(d) assigning a unique session ID to said session file; 

(e) retrieving the browsing activities recorded during said 
web browsing session to said session file from said web 
server to a second web browser in response to a request 
from said second web browser identifying said session 
file by said session ID; and 

(f) repeating the recorded browsing activities at said 
second web browser; and wherein 

said web server includes multiple session files, each one 
of said multiple session files identified by a unique 
session ID, said multiple session files corresponding to 
multiple web browsing sessions occurring at said first 
client computer. 

2. The method according to claim 1, further comprising 
the steps of: 

(d) retrieving the recorded browsing activities to a third 
web browser; 

(e) repeating the recorded browsing activities at said third
web browser; and

(f) synchronizing the browsing activities that are being 
repeated at said second and third web browsers. 

3. The method according to claim 2, wherein: 

said step of recording browsing activities performed using 
said first web browser includes the step of recording a 
time reference for said browsing activities; and 

said step of repeating the recorded browsing activities at 
said second web browser includes the step of repeating 
browsing activities according to said time references. 

4. In a computer network including a network server and 
at least one client computer, a method for repeating brows- 
ing activities occurring during multiple web browsing ses- 
sions on said at least one client computer, the method 
comprising the steps of: 

(a) creating a plurality of session files at said network 
server, each of said session files being associated with 
a respective one of said multiple web browsing ses- 
sions; 

(b) recording the browsing activities for each of the 
multiple web browsing sessions into associated session 
files at the network server; 

(c) assigning a unique session ID to each one of said 
session files; 

(d) selecting one of said session files for replay, said 
selected session file being identified for selection by its 
assigned session ID; 

(e) retrieving the selected session files to an administra- 
tive web browser; and 

(f) repeating the recorded browsing activities associated 
with said selected session file at said administrative 
web browser. 

5. The method according to claim 4, wherein: 

said step of recording the browsing activities for each of 
said web browsing sessions into associated session files 
at the network server includes the step of recording a 
time reference for said browsing activities with each 
session file; and 

said step of repeating the recorded browsing activities 
associated with said selected session file at said admin- 
istrative web browser includes the step of repeating 
browsing activities according to the time reference 
associated with said selected session file.

(57) ABSTRACT

A visual Web site analysis program, implemented as a 
collection of software components, provides a variety of 
features for facilitating the analysis, management and load- 
testing of Web sites. A mapping component scans a Web site 
over a network connection and builds a site map which 
graphically depicts the URLs and links of the site. Site maps 
are generated using a unique layout and display methodol- 
ogy which allows the user to visualize the overall architec- 
ture of the Web site. Various map navigation and URL 
filtering features are provided to facilitate the task of iden- 
tifying and repairing common Web site problems, such as 
links to missing URLs. A dynamic page scan feature enables 
the user to include dynamically-generated Web pages within 
the site map by capturing the output of a standard Web 
browser when a form is submitted by the user, and then 
automatically resubmitting this output during subsequent 
mappings of the site. An Action Tracker module detects user 
activity and behavioral data (link activity levels, common 
site entry and exit points, etc.) from server log files and then 
superimposes such data onto the site map. A Load Wizard 
module uses this activity data to generate testing scenarios 
for load testing the Web site.

This application is a continuation of application Ser. No.
09/315,795, filed May 21, 1999, which is a continuation of
Appl. Ser. No. 08/949,680, filed Oct. 14, 1997 (now U.S.
Pat. No. 5,974,572), which is a continuation-in-part of
application Ser. No. 08/840,103, filed Apr. 11, 1997, which
claims the benefit of U.S. Provisional Appl. No. 60/028,474,
filed Oct. 15, 1996.

MICROFICHE APPENDIX

This specification includes a microfiche appendix
(consisting of 1 sheet with 51 frames) which contains partial
source code listings (Appendices A and C) and an application
program interface specification (Appendix B) of a
software product that embodies the invention. These materials
form part of the disclosure of the specification.

The source code listings and screen displays of this
specification are subject to copyright protection. The copyright
owner has no objection to the facsimile reproduction of
the patent document or portions thereof as it appears in the
files or records of the U.S. Patent and Trademark Office, but
otherwise reserves all rights whatsoever.

FIELD OF THE INVENTION 

The present invention relates to software tools for
load-testing Web sites and other types of client-server systems.
More particularly, the invention relates to a method of
efficiently generating a load testing scenario that allows a
Web site to be tested according to browsing behaviors of
typical users.

BACKGROUND OF THE INVENTION 

With the increasing popularity and complexity of Internet
and intranet applications, the task of managing Web site
content and maintaining Web site effectiveness has become
increasingly difficult. Company Webmasters and business
managers are routinely faced with a wide array of burdensome
tasks, including, for example, the identification and
repair of large numbers of broken links (i.e., links to missing
URLs), the monitoring and organization of large volumes of
diverse, continuously-changing Web site content, and the
detection and management of congested links. These problems
are particularly troublesome for companies that rely on
their respective Web sites to provide mission-critical
information and services to customers and business partners.

Several software companies have developed software
products which address some of these problems by generating
graphical maps of Web site content and providing tools
for navigating and managing the content displayed within
the maps. Examples of such software tools include
WebMapper™ from Netcarta Corporation and Web Analyzer™
from In Context Corporation. Unfortunately, the graphical
site maps generated by these products tend to be difficult to
navigate, and fail to convey much of the information needed
by Webmasters to effectively manage complex Web sites. As
a result, many companies continue to resort to the burdensome
task of manually generating large, paper-based maps
of their Web sites. In addition, many of these products are
only capable of mapping certain types of Web pages, and do
not provide the types of analysis tools needed by Webmasters
to evaluate the performance and effectiveness of Web
sites.

Another problem in the field of Web site and intranet
management relates to the ability of the site to handle peak
loads. The heavy use of a site can lead to a significant
degradation in performance, or even a complete loss of
service. Examples of Web sites that have undergone severe
performance degradations during heavy use include the IBM
Olympic site during the 1996 summer Olympics, and the
CNN Interactive Election Day site during the 1996
presidential election.

Mercury Interactive Corporation, the assignee of the
present application, has addressed this problem by developing
software tools that allow companies to load-test their
World Wide Web and intranet sites. Mercury Interactive's
LoadRunner® 4.5 and Astra™ SiteTest 1.0 products, for
example, include functionality for sending large volumes of
client requests (representing hundreds or thousands of
concurrent users) to a site while monitoring the site's
performance. These products use a virtual user executable, or
"Vuser," to send the client requests to the Web site while
monitoring the site's performance. The Vuser sends the
client requests to the site according to a pre-defined test
script (also referred to as a "Vuser script" or "Web script"),
which is in the general form of a list of the HTTP (HyperText
Transfer Protocol) messages to be sent to the site. In one
implementation, up to 50 Vusers can be run concurrently on
a single Windows® NT or 95 workstation, with different Vusers
using different test scripts.

SUMMARY OF THE INVENTION 

To facilitate the generation of the test scripts, the
LoadRunner® 4.0 and Astra™ SiteTest 1.0 products are
provided with a script generation tool referred to as the "Vuser
Generator." This tool operates by capturing actual HTTP
and/or HTTPS (Secure HTTP) traffic between a standard
Web browser and the Web server, and recording this traffic
into a script file. Thus, for example, to provide a Vuser that
repeatedly accesses a particular sequence of Web pages, the
user can launch the Vuser Generator, and then sequentially
access the Web pages with a standard browser.

To generate a test that emulates multiple concurrent users, 
the user can generate and save multiple test scripts (e.g., a 
set of 5 scripts), and then use a "Scenario Wizard" tool to 
define how these scripts will be used to test the site. For 
example, the user can define a group of 10 Vusers that will 
play back the same script, and can define other Vusers that 
will play back uniquely-assigned scripts. This test definition 
(referred to generally as a "test scenario" or "scenario") is 
then stored as a "scenario file" that can thereafter be loaded 
and run to test the site. 
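
As an illustration only, the kind of test definition described above (one group of Vusers sharing a script, plus Vusers with uniquely-assigned scripts) might be modeled as shown below. This is a minimal sketch, not the LoadRunner or SiteTest scenario file format; the class names and script file names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VuserGroup:
    """A group of virtual users that all play back the same script."""
    script: str  # hypothetical script file name
    count: int   # number of concurrent Vusers in the group


@dataclass
class Scenario:
    """A test definition: which scripts run, and how many Vusers run each."""
    groups: list[VuserGroup] = field(default_factory=list)

    def total_vusers(self) -> int:
        return sum(group.count for group in self.groups)

    def assignments(self) -> list[tuple[int, str]]:
        """Expand the groups into (vuser_id, script) pairs."""
        pairs = []
        for group in self.groups:
            for _ in range(group.count):
                pairs.append((len(pairs) + 1, group.script))
        return pairs


# Ten Vusers share one script; two others each get their own script.
scenario = Scenario([
    VuserGroup("browse_catalog.script", 10),
    VuserGroup("search.script", 1),
    VuserGroup("checkout.script", 1),
])
print(scenario.total_vusers())
```

Keeping the group (script, count) as the stored unit, and expanding it only when the test is run, keeps the saved definition small even when many Vusers share a script.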

While the Vuser Generator tool provides a simple method 
for generating test scripts, the tool requires the user to 
actively browse the Web site. This can be burdensome, 
especially if the Web site is large. For example, if the Web 
site has 200 pages, the user would normally have to browse 
all 200 pages to generate a test script (or a set of test scripts) 
that covers the entire site. The process of generating a 
scenario can also be cumbersome, especially if a large 
number of scripts are involved. 

In addition, the test scripts and scenarios generated by this
method are not necessarily representative of the paths and
browsing behaviors followed by typical visitors. (To distinguish
between users of the disclosed tools and regular users
of the site, the term "visitor" is used herein to refer to the
latter.) For example, the resulting load test may heavily
stress an area of the site that is rarely accessed, while failing
to adequately stress more popular areas of the site.

It is thus a goal of the invention to provide a test
generation tool that eliminates the need for the user to
browse the Web site or actively define the scenario. It is a
further goal of the invention to provide a test generation
method that allows the site to be stressed according to
typical usage patterns.

In accordance with these goals, a software module and
associated method are provided for automatically generating
test scenarios (including test scripts) based on information
stored within a server access log file. The server access log
file ("log file") is preferably a standard-format log file of the
type commonly generated by commercially-available Web
servers. These log files contain information about accesses
that have been made to the site over some period of time
(typically a few weeks), including the IP addresses of the
visitors, the URLs (Uniform Resource Locators) that were
accessed, and the times and dates of the accesses.

In accordance with the invention, the software module
implements a scenario generation process that preserves the
load distribution (e.g., the distribution of accesses among
URLs) represented within the log file. Thus, for example, if,
during the time period to which the log file corresponds, a
first area of the site was accessed more heavily than a second
area, the resulting test scenario will stress the first area more
heavily than the second area (according to the same general
proportions as within the log file).

In a preferred embodiment the scenario generation process
involves using the IP addresses and timestamps within
the log file to "trace" the navigation paths taken by individual
users. This produces a routes list which includes
information about the number of "hits" that occurred on
each link. The routes list is then translated into a scenario
that comprises a set of test scripts (stored as script files) and
a scenario file. The scenario can thereafter be loaded and
executed, using either the LoadRunner product or the Astra
SiteTest product, to load-test the Web site.

BRIEF DESCRIPTION OF THE DRAWINGS

The various features of the invention will now be
described in greater detail with reference to the drawings of
a preferred software package known as Astra™ SiteManager
("Astra"), its screen displays, and various related components.
In these drawings, reference numbers are re-used,
where appropriate, to indicate a correspondence between
referenced items.

FIG. 1 is a screen display which illustrates an example
Web site map generated by Astra, and which illustrates the
menu, tool and filter bars of the Astra graphical user
interface.

FIGS. 2 and 3 are screen displays which illustrate respective
[illegible] views of the site map of FIG. 1.

FIG. 4 is a screen display which illustrates a split-screen
display mode, wherein a graphical representation of a Web
site is displayed in an upper window and a textual
representation of the Web site is displayed in a lower window.

FIG. 5 is a screen display which illustrates a navigational
aid of the Astra graphical user interface.

FIG. 6 is a screen display illustrating a feature which
allows a user to selectively view the outbound links of a URL
in a hierarchical display format.

FIG. 7 is a block diagram which illustrates the general
architecture of Astra, which is shown in the context of a
client computer communicating with a Web site.

FIG. 8 illustrates the object model used by Astra.

FIG. 9 illustrates a multi-threaded process used by Astra
for scanning and mapping Web sites.

FIG. 10 illustrates the general decision process used by
Astra to scan a URL.

FIG. 11 is a block diagram which illustrates a method
used by Astra to scan dynamically-generated Web pages.

FIG. 12 is a flow diagram which illustrates the
method for scanning dynamically-generated Web pages.

FIGS. 13-15 are a sequence of screen displays which
[illegible] scanning feature.

FIG. 16 is a screen display which illustrates the site map
of FIG. 1 following the application of a filter which filters
out all URLs (and associated links) having a status other
than [illegible].

FIG. 17 illustrates the general program sequence followed
by Astra to generate filtered maps of the type shown in FIG.
16.

FIG. 18 illustrates the filtered map of FIG. 16 redisplayed
in Astra's Visual Web Display™ format.

FIG. 19 is a screen display which illustrates an activity
monitoring feature of Astra.

FIG. 20 illustrates a decision process used by Astra to
generate link activity data (of the type illustrated in FIG. 19)
from a server access log file.

FIG. 21 is a [illegible] which illustrates a
comparison tool of Astra.

FIG. 22 is a [illegible] which [illegible]
feature of Astra.

FIGS. 23 and 24 are partial screen displays which illustrate
layout features in accordance with another embodiment
of the invention.

FIG. 25 illustrates the process used by the Astra LoadRunner
and SiteTest products to load-test Web sites.

FIG. 26 is a partial screen display which illustrates the
user interface of the Astra SiteTest Controller.

FIGS. 27-29 are partial screen displays which illustrate
an automated load-testing feature of the invention.

FIG. 30 is a flow diagram which illustrates the flow of
data during the automated generation of load test scenarios.

FIGS. 31 and 32 are flow charts of a process for automatically
generating load testing scenarios using information
stored within a server access log file.

The screen displays included in the figures were generated
from screen captures taken during the execution of the Astra
SiteManager and Astra SiteTest programs. The original
screen captures have been modified to comply with patent
office standards.

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENT

The description of the preferred embodiment is arranged
within the following sections:

I. Glossary of Terms and Acronyms

II. Overview of Astra SiteManager

III. Map Layout and Display Methodology

IV. Astra Graphical User Interface

V. Astra Software Architecture

VI. Scanning Process

VII. Scanning and Mapping of Dynamically-Generated
Pages

VIII. Display of Filtered Maps

IX. Tracking and Display of Visitor Activity

X. Map Comparison Tool

XI. Link Repair Plug-in 

XII. Automated Generation of Load Testing Scenarios 

(a) Web Site Testing with LoadRunner and SiteTest 

(b) Overview of Scenario Generation Process 

(c) Two-Phase Translation Process

(d) Source Code Listing 

XIII. Conclusion 
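
The log-based scenario generation summarized earlier (tracing each visitor's navigation path from the IP addresses and timestamps in a server access log, and counting the "hits" on each link) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: it assumes the log is in the Common Log Format, treats consecutive requests from one IP address as a traversed link, and ends a visit after an assumed 30-minute idle gap. The function names and sample log lines are hypothetical.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Common Log Format: host ident user [time] "method url protocol" status bytes
LOG_LINE = re.compile(r'(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) \S+')
VISIT_GAP = timedelta(minutes=30)  # assumed idle time that ends a visit


def parse(line):
    """Return (ip, timestamp, url) for a log line, or None if it is malformed."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    ip, stamp, _method, url, _status = match.groups()
    return ip, datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z"), url


def trace_routes(lines):
    """Group requests by IP address, order them by time, and count link hits."""
    by_ip = defaultdict(list)
    for line in lines:
        record = parse(line)
        if record:
            ip, when, url = record
            by_ip[ip].append((when, url))

    paths, link_hits = [], Counter()
    for hits in by_ip.values():
        hits.sort()
        path = [hits[0][1]]
        for (t_prev, u_prev), (t_cur, u_cur) in zip(hits, hits[1:]):
            if t_cur - t_prev > VISIT_GAP:
                paths.append(path)  # long pause: start a new visit
                path = [u_cur]
            else:
                link_hits[(u_prev, u_cur)] += 1
                path.append(u_cur)
        paths.append(path)
    return paths, link_hits


LOG = [
    '10.0.0.1 - - [14/Oct/1996:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.2 - - [14/Oct/1996:10:00:05 +0000] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.1 - - [14/Oct/1996:10:00:20 +0000] "GET /products.html HTTP/1.0" 200 2210',
    '10.0.0.2 - - [14/Oct/1996:10:00:30 +0000] "GET /products.html HTTP/1.0" 200 2210',
    '10.0.0.1 - - [14/Oct/1996:10:01:00 +0000] "GET /support.html HTTP/1.0" 200 870',
]
paths, link_hits = trace_routes(LOG)
for link, count in link_hits.most_common():
    print(link, count)
```

Each traced path is a candidate test script, and the per-link hit counts give the relative weights needed to keep the generated load in roughly the same proportions as the logged traffic.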

I. Glossary of Terms and Acronyms 

The following definitions and explanations provide background
information pertaining to the technical field of the
present invention, and are intended to facilitate an
understanding of both the invention and the preferred
embodiments thereof. Additional definitions are provided
throughout the detailed description.

Internet. The Internet is a collection of interconnected
public and private computer networks that are linked
together by a set of standard protocols (such as TCP/IP,
HTTP, FTP and Gopher) to form a global, distributed
network.

Document. Generally, a collection of data that can be
viewed using an application program, and that appears
or is treated as a self-contained entity. Documents
typically include control codes that specify how the
document content is displayed by the application program.
An "HTML document" is a special type of
document which includes HTML (HyperText Markup
Language) codes to permit the document to be viewed
using a Web browser program. An HTML document
that is accessible on a World Wide Web site is commonly
referred to as a "Web document" or "Web page."
Web documents commonly include embedded
components, such as GIF (Graphics Interchange
Format) files, which are represented within the HTML
coding as links to other URLs. (See "HTML" and
"URL" below.)

Hyperlink. A navigational link from one document to
another, or from one portion (or component) of a
document to another. Typically, a hyperlink is displayed
as a highlighted word or phrase that can be
clicked on using the mouse to jump to the associated
document or document portion.

Hypertext System. A computer-based informational system
in which documents (and possibly other types of
data entities) are linked together via hyperlinks to form
a user-navigable "web." Although the term "text"
appears within "hypertext," the documents and hyperlinks
of a hypertext system may (and typically do)
include other forms of media. For example, a hyperlink
to a sound file may be represented within a document
by a graphic image of an audio speaker.

World Wide Web. A distributed, global hypertext system,
based on a set of standard protocols and conventions
(such as HTTP and HTML, discussed below), which
uses the Internet as a transport mechanism. A software
program which allows users to request and view World
Wide Web ("Web") documents is commonly referred to
as a "Web browser," and a program which responds to
such requests by returning ("serving") Web documents
is commonly referred to as a "Web server."

Web Site. As used herein, "web site" refers generally to a
database or other collection of inter-linked hypertextual
documents ("web documents") and associated data
entities, which is accessible via a computer network,
and which forms part of a larger, distributed informational
system. Depending upon its context, the term
may also refer to the associated hardware and/or software
server components used to provide access to such
documents. When used herein with initial capitalization
(i.e., "Web site"), the term refers more specifically to a
web site of the World Wide Web. (In general, a Web site
corresponds to a particular Internet domain name, such
as "merc-int.com," and includes the content of or
associated with a particular organization.) Other types
of web sites may include, for example, a hypertextual
database of a corporate "intranet" (i.e., an internal
network which uses standard Internet protocols), or a
site of a hypertext system that uses document retrieval
protocols other than those of the World Wide Web.

Content Object. As used herein, a data entity (document,
document component, etc.) that can be selectively
retrieved from a web site. In the context of the World
Wide Web, common types of content objects include
HTML documents, GIF files, sound files, video files,
Java applets and aglets, and downloadable applications,
and each object has a unique identifier (referred to as
the "URL") which specifies the location of the object.
(See "URL" below.)

URL (Uniform Resource Locator). A unique address 
which fully specifies the location of a content object on 
the Internet. The general format of a URL is protocol:// 
machine-address/path/filename. (As will be apparent 
from the context in which it is used, the term "URL" is 
also used herein to refer to the corresponding content 
object itself.) 
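
As a brief illustration of the general URL format given above, the sketch below splits a URL into the four parts named in the definition using Python's standard `urllib.parse` module. The function name, the dictionary keys, and the example path are assumptions made for this illustration; only the "merc-int.com" domain name appears in this description.

```python
import posixpath
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into protocol, machine address, path and filename."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "machine_address": parts.netloc,
        "path": directory,
        "filename": filename,
    }


info = split_url("http://merc-int.com/products/astra/index.html")
print(info)
```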

Graph/Tree. In the context of database systems, the term 
"graph" (or "graph structure") refers generally to a data 
structure that can be represented as a collection of 
interconnected nodes. As described below, a Web site 
can conveniently be represented as a graph in which 
each node of the graph corresponds to a content object 
of the Web site, and in which each interconnection 
between two nodes represents a link within the Web 
site. A "tree" is a specific type of graph structure in 
which exactly one path exists from a main or "root" 
node to each additional node of the structure. The terms
"parent" and "child" are commonly used to refer to the
interrelationships of nodes within a tree structure (or
other hierarchical graph structure), and the term "leaf"
or "leaf node" is used to refer to nodes that have no
children. For additional information on graph and tree
data structures, see Alfred V. Aho et al., Data Structures
and Algorithms, Addison-Wesley, 1982.
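
To make the graph and tree terminology concrete, the sketch below represents a small, hypothetical Web site as a graph (a dictionary mapping each URL to the URLs it links to) and derives a tree from it by keeping only the first link that reaches each node in a breadth-first traversal. The breadth-first choice is an assumption made for illustration; it is not presented here as the layout method used by Astra.

```python
from collections import deque


def spanning_tree(links, root):
    """Return a parent-to-children mapping with exactly one path per node."""
    children = {root: []}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for target in links.get(node, []):
            if target not in children:  # first time this node is reached
                children[target] = []
                children[node].append(target)
                queue.append(target)
    return children


def leaves(children):
    """Nodes of the tree that have no children."""
    return [node for node, kids in children.items() if not kids]


# Hypothetical site graph: note the cycles back to the home page.
SITE = {
    "/index.html": ["/products.html", "/about.html", "/logo.gif"],
    "/products.html": ["/index.html", "/logo.gif", "/astra.html"],
    "/about.html": ["/index.html"],
}
tree = spanning_tree(SITE, "/index.html")
print(tree)
print(leaves(tree))
```

The graph contains several links to "/logo.gif" and back to the home page, while the derived tree keeps a single parent for every node, which is what makes the parent, child and leaf relationships well defined.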

TCP/IP (Transmission Control Protocol/Internet Protocol). A
standard Internet protocol which specifies how computers
exchange data over the Internet. TCP/IP is the
lowest level data transfer protocol of the standard
Internet protocols.

HTML (HyperText Markup Language). A standard coding
convention and set of codes for attaching presentation
and linking attributes to informational content within
documents. During a document authoring stage, the
HTML codes (referred to as "tags") are embedded
within the informational content of the document.
When the Web document (or "HTML document") is
subsequently transmitted by a Web server to a Web
browser, the codes are interpreted by the browser and
used to parse and display the document. In addition to
specifying how the Web browser is to display the
document, HTML tags can be used to create hyperlinks to
other Web documents. For more information on
HTML, see Ian S. Graham, The HTML Source Book,
John Wiley and Sons, Inc., 1995 (ISBN 0471-11894-4).

HTTP (Hypertext Transfer Protocol). The standard World
Wide Web client-server protocol used for the exchange
of information (such as HTML documents, and client
requests for such documents) between a Web browser
and a Web server. HTTP includes several different
types of messages which can be sent from the client to
the server to request different types of server actions.
For example, a "GET" message, which has the format
GET <URL>, causes the server to return the content
object located at the specified URL.

Webcrawling. Generally, the process of accessing and
processing web site content (typically using an automated
searching/parsing program) and generating a
condensed representation of such content. Webcrawling
routines are commonly used by commercial Internet
search engines (such as Infoseek™ and Alta
Vista™) to generate large indexes of the terms that
appear within the various Web pages of the World Wide
Web.

API (Application Program Interface). A software interface
that allows application programs (or other types of
programs) to share data or otherwise communicate with
one another. A typical API comprises a library of API
functions or "methods" which can be called in order to
initiate specific types of operations.

CGI (Common Gateway Interface). A standard interface
which specifies how a Web server (or possibly another
information server) launches and interacts with external
programs (such as a database search engine) in
response to requests from clients. With CGI, the Web
server can serve information which is stored in a format
that is not readable by the client, and present such
information in the form of a client-readable Web page.
A CGI program (called a "CGI script") may be
invoked, for example, when a Web user fills out an
on-screen form which specifies a database query. For
more information on CGI, see Ian S. Graham, The
HTML Source Book, John Wiley and Sons, Inc., 1995
(ISBN 0471-11894-4), pp. 231-278.

OLE (Object Linking and Embedding). An object
technology, implemented by Windows-based applications,
which allows objects to be linked to one another and
embedded within one another. OLE Automation, which is a
feature of OLE 2, enables a program's functionality to be
exposed as OLE objects that can be used to build other
applications. For additional information on OLE and OLE
Automation, see OLE 2 Programmer's Reference Manual,
Volume One, Microsoft Corporation, 1996 (ISBN
1-55615-628-6).

II. Overview of Astra SiteManager

The Astra SiteManager program ("Astra") includes various
features for facilitating the mapping, analysis (including
load-testing) and management of Web sites. The program
runs on a client computer under either the Windows® NT or
the Windows® 95 operating system.

Given the address of a Web site's home page, Astra
automatically scans the Web site and creates a graphical site
map showing all of the URLs of the site and the links
between these URLs. The layout and display method used
by Astra for generating the site map provides a highly
intuitive, graphical representation which allows the user to
visualize the layout of the site. Using this mapping feature,
in combination with Astra's powerful set of integrated tools
for navigating, filtering and manipulating the Web site map,
users can intuitively perform such actions as isolate and
repair broken links, focus in on Web pages (and other
content objects) of a particular content type and/or status,
and highlight modifications made to a Web site since a prior
mapping. In addition, users can utilize a Dynamic Scan™
feature of Astra to automatically append dynamically-generated
Web pages (such as pages generated using CGI
scripts) to their maps. In addition, using Astra's activity
monitoring features, users can monitor visitor activity levels
on individual links and URLs, and study visitor behavior
patterns during Web site visits. Further, the user can invoke
a Load Wizard™ feature to cause the automatic generation
of a load-testing scenario, which can in turn be used to
load-test the site using Mercury Interactive's LoadRunner
and Astra SiteTest products.

Astra is based on a highly extensible architecture which
facilitates the addition of new tools to the Astra framework.
As part of this architecture, a "core" Astra component
(which includes the basic Web site scanning and mapping
functionality) has an API for supporting the addition of
plug-in components. This API includes functions for allowing
the plug-in components to manipulate the display of the
site map, and to display their own respective data in
conjunction with the Astra site map. Through this API, new
applications can be added which extend the functionality of
the package while taking advantage of the Astra mapping
scheme.

Throughout this description, names of product features
and software components are used with initial capitalization.
These names are used herein for ease of description only,
and are not intended to limit the scope of the invention.

FIGS. 1-3 illustrate Astra's primary layout methodology,
referred to herein as "Visual Web Display™," for displaying
graphical representations ("site maps" or "maps") of Web
sites. These figures will also be used to describe some of the
graphical user interface (GUI) features of Astra.

FIG. 1 illustrates a site map 30 of a demonstration Web
site which was derived from the actual Web site of Mercury
Interactive (i.e., the URLs accessible under the
"merc-int.com" Internet domain name). (For purposes of this
detailed description, it may be assumed that "Web site"
refers to the content associated with a particular Internet
domain name.) The Web site is depicted by Astra as a
collection of nodes, with pairs of nodes interconnected by
lines. Each node of the map represents a respective content
object of the Web site and corresponds to a respective URL.
(The term "URL" is used herein to refer interchangeably to
both the address of the content object and to the object itself;
where a distinction between the two is important, the term
"URL" is followed by an explanatory parenthetical.)
Examples of URLs (content objects) which may exist within
a typical Web site include HTML documents (also referred
to herein as "Web pages"), image files (e.g., GIF and PCX
files), mail messages, Java applets and aglets, audio files,
video files, and applications.

As generally illustrated by FIGS. 3 and 4, different icons
are used to represent the different URL types when the nodes
are viewed in a sufficiently zoomed-in mode. (Generic icons
of the type best illustrated by FIG. 18 are used to display
nodes that fall below a predetermined size threshold.) As
described below, special icons and visual representations are
also used to indicate status information with respect to the
URLs. For example, special icons are used to depict,
respectively, inaccessible URLs, URLs which are missing,
URLs for which access was denied by the server, and URLs
which have been detected but have not been scanned. (The
term "scan" refers generally to the process of sending
informational requests to server components of a computer
network, and in the context of the preferred embodiment,
refers to the process of sending requests to Web server
components to obtain Web site content associated with
specific URLs.)

TABLE 1

URL Type          Scanning Status
HTML              Not found
HTML with Form    Not Scanned
Image             Inaccessible
Sound             Access Denied
Application
Text
Unknown
Video
Gopher
FTP
Dynamic Page

The lines which interconnect the nodes (URL icons) in 
FIGS. 1-3 (and the subsequent figures with screen displays) 
represent links between URLs. As is well understood in the 
art, the functions performed by these links vary according to 
URL type. For example, a link from one HTML document 
to another HTML document normally represents a hyperlink 
which allows the user to jump from one document to the 
other while navigating the Web site with a browser. In FIG. 
1, an example of a hyperlink which links the home page 
URL (shown at the center of the map) to another HTML 
page (displayed to the right of the home page) is denoted by 
reference number 32. (As generally illustrated in FIG. 1 and 
the other figures which illustrate screen displays, regular 
HTML documents are displayed by Astra as a shaded 
document having text thereon.) A link between an HTML 
document and a GIF file, such as link 36 in FIG. 3, normally 
represents a graphic which is embedded within the Web 
page. 

Maps of the type illustrated in FIG. 1 are generated by 
Astra using an HTTP-level scanning process (described 
below) which involves reading and parsing the Web
site's HTML pages to identify the architecture (i.e., the 
arrangement of URLs and links) of the Web site, and to 
obtain various status information (described below) about 
the Web site's URLs. The basic scanning process used for 
this purpose is generally similar to the scanning process used 
by conventional Webcrawlers. As part of Astra's Dynamic 
Scan feature, Astra additionally implements a special 
dynamic page scanning process which permits dynamically- 
generated Web pages to be scanned and included in the Web 
site map. As described below, this process involves captur- 
ing the output of a Web browser when the user submits an 
HTML-embedded form (such as when the user submits a 
database query), and then reusing the captured dataset 
during the scanning process to recreate the form submission 
and append the results to the map. 
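
The parsing part of such a scan, in which an HTML page is read to find the URLs it links to, can be sketched in Python as follows. This is only an illustration of the general idea and not Astra's scanner: the class name `LinkExtractor`, the function `extract_links`, and the "hyperlink"/"embedded" labels are assumptions introduced here, and only anchor and image tags are handled.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect (absolute URL, kind) pairs from one HTML page (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            # A link from one document to another: a hyperlink.
            self.links.append((urljoin(self.base_url, attrs["href"]), "hyperlink"))
        elif tag == "img" and attrs.get("src"):
            # A link to an image file: a graphic embedded within the page.
            self.links.append((urljoin(self.base_url, attrs["src"]), "embedded"))


def extract_links(base_url, html_text):
    """Parse html_text and return the links found, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html_text)
    return parser.links
```

Each extracted URL would then become a node of the map and each pair a link, with the scanner repeating the process for every HTML page it discovers.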

Table 1 lists the predefined icons that are used by Astra to 
graphically represent different URL types within site maps. 
As illustrated, the URL icons generally fall into two categories:
object-type ("URL type") icons and status icons. The
object-type icons are used to indicate the content or service 
type of URLs that have been successfully scanned. The 
status icons are used to indicate the scanning status (not 
found, access denied, etc.) of URLs for which either (i) 
scanning has not been performed, or (ii) scanning was 
unsuccessful. Various examples of these two types of icons 
are included in the figures. 






Once the map has been generated, the user can interac- 
tively navigate the map using various navigation tools of the 
Astra GUI, such as the zoom-in and zoom-out buttons 34, 36 
(FIG. 1) and the scrolling controls 40, 42 (FIGS. 2 and 3).
To zoom in on a particular region of the map 30, the user can
click on the zoom-in button 34 and then use the mouse to 
draw a box around the map region of interest; Astra will then 
re-size the highlighted region to generally fit the display 
screen. As will be recognized by those skilled in the art, the 
ability to zoom in and out between high level, perspective 
views which reveal the overall architecture of the site, and 
magnified (zoomed-in) sub-views which reveal URL-specific
information about the Web site, greatly facilitates
the task of navigating and monitoring Web site content. 

As generally illustrated by FIG. 3, the annotations (page 
titles, filenames, etc.) of the URLs begin to appear (below 
the associated icons) as the user continues to zoom in. As 
further illustrated by FIG. 3, the URL (address) of a node is 
displayed when the mouse cursor is positioned over the 
corresponding icon. 

While navigating the map, the user can retrieve a URL 
(content object) from the server by double-clicking on the 
corresponding URL icon; this causes Astra to launch the 
client computer's default Web browser (if not already 
running), which in turn retrieves the URL from the Web
server. For example, the user can double-click on the URL 
icon for an HTML document (using the left mouse button) 
to retrieve and view the corresponding Web page. When the 
user clicks on a URL icon using the right mouse button, a 
menu appears which allows the user to perform a variety of 
actions with respect to the URL, including viewing the 
URL's properties, and launching an HTML editor to retrieve 
and edit the URL. With reference to FIG. 3, for example, the 
user can click on node 44 (using the right mouse button), and 
can then launch an HTML editor to edit the HTML docu- 
ment and delete the reference to missing URL 45. (As 
illustrated by FIG. 3, missing URLs are represented within 
Astra maps by a question mark icon.) 

One important feature of Astra, referred to herein as 
"Automatic Update," allows the user to update an existing 
Web site map to reflect any changes that have been made to 
the Web site since a prior mapping of the site. To initiate this
feature, the user selects a "start Automatic Update" button 
37 (FIG. 1), or selects the corresponding menu item, while 
viewing a site map. This initiates a re-scanning process in 
which Astra scans the URLs of the Web site and updates the 
map data structure to reflect the current architecture of the 
site. As part of this process, Astra implements a caching 
protocol which eliminates the need to download URLs and 
URL headers that have not been modified since the most 
recent mapping. (This protocol is described below under the 
heading "SCANNING PROCESS.") This typically allows 
the map to be updated in a much shorter period of time than 
is required to perform the original mapping. This feature is 
particularly useful for Webmasters of large Web sites that 
have dynamically-changing content. 

Other features of Astra are described in the following 
sections. 

III. Map Layout and Display Methodology (FIGS. 1-3, 23
and 24) 

An important aspect of the invention is the methodology 
used by Astra for presenting the user with a graphical, 
navigable representation of the Web site. This feature of 
Astra, which is referred to as Visual Web Display 
(abbreviated as "VWD" herein), allows the user to view and
navigate complex Web structures while visualizing the inter- 
relationships between the data entities of such structures. 
The method used by Astra to generate VWD site maps is 
referred to herein as the "Solar Layout method," and is 
described at the end of this section. 

One aspect of the VWD format is the manner in which 
children nodes ("children") are displayed relative to their 
respective parent nodes ("parents"). (In the context of the
preferred embodiment, the term "node" refers generally to a
URL icon as displayed within the site map.) As illustrated by
the collection of nodes shown in FIG. 3, the parent 44 is
displayed in the center of the cluster, and the seven children 
48 are positioned around the parent 44 over an angular range 
of 360 degrees. One benefit of this layout pattern is that it 
allows collections of related nodes to be grouped together on 
the screen in relatively close proximity to one another, 
making it easy for the user to identify the parent-child
relationships of the nodes. This is in contrast to the expandable
folder type representations used by Webmapper™, the Win- 
dows® 95 Explorer, and other Windows® applications, in 
which it is common for a child to be separated from its 
parent folder by a long list of other children. 

In this FIG. 3 example, all of the children 48 are leaf 
nodes (i.e., nodes which do not themselves have children). 
As a result, all of the children 48 are positioned approxi- 
mately equidistant from the parent 44, and are spaced apart 
from one another by substantially equal angular increments. 
Similar graphical representations to that of FIG. 3 are 
illustrated in FIG. 1 by node clusters 52, 54 and 56. As 
illustrated by these three clusters in FIG. 1, both (i) the size
of the parent icon, and (ii) the distance from the parent to its
children, are proportional to the number of immediate chil- 
dren of the parent. Thus, for example, cluster 56 has a larger 
diameter (and a larger parent icon) than clusters 52 and 54. 
This has the desirable effect of emphasizing the pages of the 
Web site that have the largest numbers of outgoing links. (As 
used herein, the term "outgoing links" includes links to GIF 
files and other embedded components of a document.)

As best illustrated by cluster 64 in FIG. 2, of which node 
65 is the primary parent or "root" node, children which have 
two or more of their own children (i.e., grandchildren of the 
root) are positioned at a greater distance from the root node 
65 than the leaf nodes of the cluster, with this distance being 
generally proportional to the size of the sub-cluster of which 
the child is the parent. For example, node 66 (which has 3 
children) is positioned farther from the cluster's root node 
65 than leaf nodes 70; and the parent of cluster 60 is 
positioned farther from the root node 65 than node 66. As 
illustrated in FIG. 1, this layout principle is advantageously
applied to all of the nodes of the Web site that have children. 
The recursive method (referred to as "Solar Layout") used 
by Astra to implement these layout and display principles is 
described below. 

Another aspect of the layout method is that the largest 
"satellite" cluster of a parent node is centered generally 
opposite from (along the same line as) the incoming link to 
the parent node. This is illustrated, for example, by cluster 
54 in FIG. 1 and by cluster 60 in FIG. 2, both of which are 
positioned along the same line as their respective parents. 
This aspect of the layout arrangement tends to facilitate 
visualization by the user of the overall architecture of the 
site. 

As will be apparent from an observation of FIG. 1, the 
graphical map produced by the application of the above 
layout and display principles has a layout which resembles 
the general arrangement of a solar system, with the home 
page positioned as the sun, the children of the home page
being in orbit around the sun, the grandchildren of the home 
page being in orbit around their immediate respective 
parents, and so on. One benefit of this mapping arrangement 
is that it is well suited for displaying the entire site map of 
a complex Web site on a single display screen (as illustrated 
in FIG. 1). Another benefit is that it provides an intuitive 
structure for navigating the URLs of a complex Web site. 
While this mapping methodology is particularly useful for
the mapping of Web sites, the methodology can also be
applied, with the realization of similar benefits, to the
mapping of other types of databases. For example, the VWD
methodology can be used to facilitate the viewing and
navigation of a conventional PC file system.

Another benefit of this site map layout and display
methodology is that the resulting display structure is well
suited for the overlaying of information on the map. Astra
takes full advantage of this benefit by providing a set of API
functions which allow other applications (Astra plug-ins) to
manipulate and add their respective display data to the site
map. An example of an Astra plug-in which utilizes this
feature is the Action Tracker™ tool, which superimposes
user activity data onto the site map based on analyses of
server access log files. The Astra plug-in API and the Action
Tracker plug-in are described in detail below.

As illustrated in FIG. 1, all of the nodes of the site map
(with the exception of the home page node) are displayed as
having a single incoming link, even though some of the
URLs of the depicted Web site actually have multiple
incoming links. Stated differently, the Web site is depicted in
the site map 30 as though the URLs are arranged within a
tree data structure (with the home page as the main root),
even though a tree data structure is not actually used. This
simplification to the Web site architecture is made by
extracting a span tree from the actual Web site architecture
prior to the application of a recursive layout algorithm, and
then displaying only those links which are part of the
spanning tree. (In applications in which the database being
mapped is already arranged within a tree directory structure,
this step can be omitted.) As a result, each URL of the Web
site is displayed exactly once in the site map. Thus, for
example, even though a particular GIF file may be embedded
within many different pages of the Web site, the GIF file
will appear only once within the map. This simplification to
the Web site architecture for mapping purposes makes it
practical and feasible to graphically map, navigate and
analyze complex Web sites in the manner described above.

Because the Visual Web Display format does not show all
of the links of the Web site, Astra supports two additional
display formats which enable the user to display,
respectively, all of the incoming links and all of the outgoing
links of a selected node. To display all of the outgoing links
of a given node, the user selects the node with the mouse and
then selects the "display outgoing links" button 72 (FIG. 1)
from the tool bar 46. Astra then displays a hierarchical view
(in the general form of a tree) of the selected node and its
outgoing links, as illustrated by FIG. 6. Similarly, to display
the incoming links of a node, the user selects the node and
then clicks on the "display incoming links" button 71. (A
screen display illustrating the incoming links format is
shown in FIG. 22.) To restore the Visual Web Display view,
the user clicks on the VWD button 73.
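
Because these two views draw on the full link graph rather than on the span tree, the lookups behind them can be sketched as below. The `links` dictionary (mapping each URL to the URLs it links to) is a hypothetical stand-in for the graph data structure, and both function names are assumptions made for illustration.

```python
def outgoing_links(links, url):
    """All URLs that `url` links to, whether or not the span tree shows them."""
    return sorted(links.get(url, ()))


def incoming_links(links, url):
    """All URLs that contain a link to `url`."""
    return sorted(src for src, targets in links.items() if url in targets)
```

For a GIF file embedded in several pages, `incoming_links` would return every one of those pages, even though the Visual Web Display view shows the file with a single incoming link.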

The Solar Layout method (used to generate VWD-format
site maps) generally consists of three steps, the second two
of which are performed recursively on a node-by-node basis.
These three steps are outlined below, together with associated
pseudocode representations. In addition, a source code
listing of the method (in C++) is included in a microfiche
appendix as Appendix A.

Step 1 — Select Span Tree 

In this step, a span tree is extracted from the graph data
structure which represents the arrangement of nodes and
links of the Web site. (The graph data structure is implemented
as a "Site Graph" OLE object, as described below.)
Any standard span tree algorithm can be used for this
purpose. In the preferred embodiment, a shortest-path span 
tree algorithm known as "Dijkstra's algorithm" is used, as 
implemented within the commercially-available LEDA 
(Library of Efficient Data types and Algorithms) software 
package. As applied within Astra, this algorithm finds the 
shortest paths from a main root node (corresponding to the 
Web site's home page or some other user-specified starting 
point) to all other nodes of the graph structure. The result of 
this step is a tree data structure which includes all of the 
URLs of the graph data structure with the home page 
represented as the main root of the tree. (For examples of
other span tree algorithms which can be used, see Alfred V.
Aho et al., Data Structures and Algorithms, Addison-Wesley,
1982.)
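
A minimal Python sketch of Step 1 is given below. It applies Dijkstra's algorithm with the standard-library `heapq` module rather than LEDA, and it assumes that every link has a weight of 1; the `links` dictionary and the function name `shortest_path_tree` are hypothetical names chosen for illustration.

```python
import heapq


def shortest_path_tree(links, root):
    """Return {url: parent_url} for a shortest-path span tree rooted at `root`.

    `links` maps each URL to the URLs it links to. Every link is given a
    weight of 1, which is an assumption made for this sketch.
    """
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    while heap:
        d, url = heapq.heappop(heap)
        if d > dist[url]:
            continue  # stale queue entry; a shorter path was already found
        for target in links.get(url, ()):
            nd = d + 1
            if nd < dist.get(target, float("inf")):
                dist[target] = nd
                parent[target] = url
                heapq.heappush(heap, (nd, target))
    return parent
```

Each reachable URL receives exactly one parent, so a file that is linked from many pages appears once in the resulting tree, attached to the first page found on a shortest path from the root.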

Step 2 — Solar Plan 

This is a recursive step which is applied on a node-by-node
basis in order to determine (i) the display size of each
node, (ii) the angular spacings for positioning the children 
nodes around their respective parents, and (iii) the distances 
for spacing the children from their respective parents. For 
each parent node, the respective sizes of the parent's satel- 
lites are initially determined. (A "satellite" is any child of the 
parent plus the child's descendants, if any.) The satellite 
sizes are then used to allocate (a) angular spacings for 
positioning the satellites around the parent, and (b) the radial 
distances between the satellites and the parent. This process 
is repeated for each parent node (starting with the lower 
level parent nodes and working up toward the home page) 
until all nodes of the graph have been processed. The 
following is a pseudocode representation of this process: 

Node::SolarPlan()
{
    IF node has no children
        return basic graphical dimension for a single node
    ELSE
        For each linked node as selected in the span tree, call SolarPlan()
        recursively;
        Based on the sum of the sizes of the satellites, allocate angle for
        positioning satellites around parent, and set satellite distances from
        parent;
        Calculate size of present cluster (parent plus satellites).
}

A modified Solar Plan process which incorporates two
additional layout features is described below and illustrated
by FIGS. 23 and 24.

Step 3 — Solar Place

This step recursively positions the nodes on the display
screen, and is implemented after Step 2 has been applied to
all of the nodes of the graph. The sequence starts by
positioning the home page at the center, and then uses the
angle and distance settings calculated in Step 2 to position
the children of the home page around the home page. This
process is repeated recursively for each parent node until all
of the nodes have been positioned on the screen.

Node::SolarPlace(x, y, entry_angle)
{
    Move this node to location (x,y)
    For each satellite:
        Calculate final angle as the sum of the entry_angle and the angle
        allocated in Step 2; Calculate satellite center (x and y coordinates)
        based on new angle and distance from current node; Call SolarPlace
        using the above-calculated angle and location.
}

In the above pseudocode representation, the "x" and "y"
parameters specify the screen position for the placement of
a node (icon), and the "entry_angle" parameter specifies the
angle of the line (link) between the node and its respective
parent. In the preferred embodiment, the method is implemented
such that the largest satellite of a parent node is
positioned using the same entry angle as the parent node, so
that the satellite center, parent node, and parent of the parent
node all fall generally along the same line. (The determination
of the largest satellite is performed in Step 2.) As
indicated above, this aspect of the layout method is illustrated
in FIG. 1 by cluster (satellite) 54, which is positioned
along the same line as both its immediate parent icon and the
home page icon.

A modified Solar Plan process will now be described with
reference to the screen displays of FIGS. 23 and 24, and to
the corresponding pseudocode representation below. This
modified process incorporates two additional layout features
which relate to the positioning of the satellites around a
parent. These layout features are implemented within the
attached source code listing (Appendix A), and are represented
generally by the highlighted text of the following
pseudocode sequence:



Node::SolarPlan()
{
    IF node has no children
        return basic graphical dimension for a single node
    ELSE
        For each linked node as selected in the span tree, call SolarPlan()
        recursively;
        Based on the sum of the sizes of the satellites + minimal weight of
        the incoming link, allocate angle for positioning satellites around
        parent, and set satellite distances from parent;
        Sort satellite list as follows: smallest child first, and in jumps of
        two next child up to the biggest, and then back to second biggest
        and in jumps of two down to smallest (e.g., 1, 3, 5 ... biggest,
        second biggest, ... 6, 4, 2);
        Calculate size of present cluster (parent plus satellites).
}
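
The Solar Plan and Solar Place steps, including the incoming-link weight and the satellite ordering of the modified process, might be sketched in Python as follows. This is an illustrative sketch under stated assumptions and not the Appendix A listing: the `Node` class, the constants `LEAF_SIZE` and `LINK_WEIGHT`, the helper `center_out_order`, and the rule used to derive satellite distances are all introduced here for illustration.

```python
import math
from dataclasses import dataclass, field

LEAF_SIZE = 1.0    # assumed "basic graphical dimension" of a single node
LINK_WEIGHT = 1.0  # assumed minimal weight reserved for the incoming link


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)  # span-tree children only
    size: float = LEAF_SIZE                        # cluster radius (Step 2)
    slots: list = field(default_factory=list)      # (child, offset, distance)
    x: float = 0.0
    y: float = 0.0


def center_out_order(satellites):
    """Order by size as 1, 3, 5, ... biggest, second biggest, ... 6, 4, 2."""
    ascending = sorted(satellites, key=lambda n: n.size)
    return ascending[0::2] + ascending[1::2][::-1]


def solar_plan(node, has_incoming_link=False):
    """Step 2: bottom-up pass that sizes clusters and allocates angles."""
    node.slots = []
    if not node.children:
        node.size = LEAF_SIZE
        return node.size
    for child in node.children:
        solar_plan(child, has_incoming_link=True)
    gap = LINK_WEIGHT if has_incoming_link else 0.0  # the root has no incoming link
    total = sum(c.size for c in node.children) + gap
    # Offsets are measured from the entry direction. The incoming link sits
    # at offset pi, so start just past it and walk once around the circle.
    cursor = -math.pi + math.pi * gap / total
    for child in center_out_order(node.children):
        share = 2 * math.pi * child.size / total
        # Assumed spacing rule: far enough out that the satellite fits its wedge.
        distance = LEAF_SIZE + child.size / math.sin(min(share, math.pi) / 2)
        node.slots.append((child, cursor + share / 2, distance))
        cursor += share
    node.size = max(d + c.size for c, _, d in node.slots)
    return node.size


def solar_place(node, x, y, entry_angle):
    """Step 3: top-down pass that turns the plan into screen coordinates."""
    node.x, node.y = x, y
    for child, offset, distance in node.slots:
        angle = entry_angle + offset
        solar_place(child,
                    x + distance * math.cos(angle),
                    y + distance * math.sin(angle),
                    angle)
```

Calling `solar_plan(home)` followed by `solar_place(home, 0.0, 0.0, 0.0)` lays out a tree rooted at a hypothetical `home` node. Because the walk around each parent starts beside the incoming link and follows the sorted order, the smallest satellites land next to that link and the largest lands opposite it, on the line through the parent and the parent's parent.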



The first of the two layout features is illustrated by FIG.
23, which is a partial screen display (together with associated
annotations) of a parent-child cluster comprising a
parent 79 and seven children or satellites 75. This layout
feature involves allocating an angular interval (e.g., 20
degrees) to the incoming link 81 to the parent 79, and then
angularly spacing the satellites 75 (which in this example are
all leaf nodes) over the remaining angular range. In the
preferred embodiment, this is accomplished by assigning a
minimal weight (corresponding to the angular interval) to
the incoming link 81, and then treating this link 81 as one of
the outgoing links 83 when assigning angular positions to
the satellites 75. As a result of this step, the satellites 75 are
positioned around the parent 79 over an angular range of less
than 360 degrees, in contrast to the clusters of FIGS. 1-5,
in which the satellites are positioned over the full 360-degree range.
(In this FIG. 23 example, because all of the satellites 75 are
leaf nodes, the satellites 75 are positioned equidistant from
the parent 79 with equal angular spacings.) One benefit of
this added step is that it allows the user to more easily
distinguish the incoming link 81 to a parent 79 from the
outgoing links 83 from the parent. With reference to the
angular notations of FIG. 23, the minimal weight is preferably
selected such that the angle θ1 between the incoming
link 81 and each of the two adjacent parent-child links 83 is
greater than or equal to the minimum angle θ2 between
adjacent parent-child links 83 for the given cluster. This
layout feature is also illustrated by FIG. 24.

The second of the two additional layout features involves
ordering the satellites around the parent based on the respective
sizes of the satellites. This feature comes into play when
a parent node has multiple satellites that differ in size from
one another. The layout arrangement which is produced by
this feature is generally illustrated by FIG. 24, which shows
a cluster having a parent node (labeled "CNN SHOWBIZ")
and 49 satellites. As illustrated by this screen image, the
satellites are ordered such that the smallest satellites 85 are
angularly positioned closest to the incoming link 89 to the
parent, and such that the largest satellites 91A-E are positioned
generally opposite from the incoming link 89. This is
preferably accomplished by sorting the satellites using the
sorting algorithm of the above pseudocode sequence (which
produces a sorted satellite list in which the satellites progress
upward from smallest to largest, and then progress downward
from second largest to second smallest), and then
positioning the satellites around the parent (starting at the
incoming link 89) in the order which results from the sorting
process. In this example, the largest satellite 91A is positioned
opposite the incoming link 89; the second and third
largest satellites 91B and 91C are positioned adjacent to the
largest satellite 91A; the fourth and fifth largest satellites
91D and 91E are positioned adjacent to the second and third
largest satellites 91B and 91C (respectively); and so on. As
is apparent from FIG. 24, this layout feature tends to produce
a highly symmetrical layout.

Other aspects of the Solar Layout method will be apparent
from an observation of the screen displays and from the
source code listing of Appendix A.

IV. Astra Graphical User Interface (FIGS. 1 and 4-6)

As illustrated in FIG. 1, the Astra menu bar includes seven
menu headings: FILE, VIEW, SCAN, MAP, URL, TOOLS
and HELP. From the FILE menu the user can perform
various file-related operations, such as save a map file to
disk or open a previously generated map file. From the
VIEW menu the user can select various display options of
the Astra GUI. From the SCAN menu the user can control
various scanning-related activities, such as initiate or pause
the automatic updating of a map, or initiate a dynamic page
scan session. From the MAP menu, the user can manipulate
the display of the map, by, for example, collapsing (hiding)
all leaf nodes, or selecting the Visual Web Display mode.
From the URL menu, the user can perform operations with
respect to user-selected URLs, such as display the URL's
content with a browser, invoke an editor to modify the
URL's content, and display the incoming or outgoing links
to/from the URL.

From the TOOLS menu the user can invoke various
analysis and management related tools. For example, the
user can invoke a map comparison tool which generates a
graphical comparison between two maps. This tool is particularly
useful for allowing the user to readily identify any
changes that have been made to a Web site's content since
a previous mapping. The user can also invoke the Action
Tracker tool, which superimposes link activity data on the
Web site map to allow the user to readily ascertain the links
and URLs that have the most hits. The Action Tracker tool
also includes code for generating load testing scenarios from
the link activity data. The user can also invoke a Link Doctor
tool which facilitates the repairing of broken links. These
and other tools and features of Astra are described in the
subsequent sections.

With further reference to FIG. 1, the Astra GUI includes
a tool bar 46 and a filter bar 47, both of which can be
selectively displayed as needed. The tool bar 46 includes
buttons for initiating commonly-performed operations.
From left to right in FIG. 1, these functions are as follows:
(a) start generation of new map, (b) open map file, (c) save
map to disk, (d) print, (e) size map to fit within window, (f)
zoom in, (g) zoom out, (h) display incoming links of selected
node, (i) display outgoing links of selected node, (j) display
map in Visual Web Display format, (k) initiate Automatic
Update, (l) pause Automatic Update, (m) resume Automatic
Update, (n) initiate Dynamic Scan, and (o) stop Dynamic
Scan. (The function performed by each button is indicated
textually when the mouse cursor is positioned over the
respective button.)

The filter bar 47 includes a variety of different filter
buttons for filtering the content of site maps. When the user
clicks on a filter button, Astra automatically hides all links
and pages of a particular type or status, as illustrated in FIG.
16 and discussed below. The filter buttons are generally
divided into three groups: content/service filters 49, status
filters 50, and location filters 51. From left to right in FIG.
1, the content/service filters 49 filter out URLs of the
following content or service types: (a) HTML, (b) HTML
forms, (c) images, (d) audio, (e) CGI, (f) Java, (g) other
applications, (h) plain text, (i) unknown, (j) redirect, (k)
video, (l) Gopher, (m) FTP, and (n) all other Internet
services. The status filters 50 filter out URLs of the following
statuses (from left to right): (a) not found, (b) inaccessible
(e.g., no response from server), (c) access denied, (d)
not scanned, and (e) OK. The left-hand and right-hand
location filters 51 filter out local URLs and external URLs,
respectively, based on the domain names of the URLs.
Multiple filters can be applied concurrently.

FIG. 4 illustrates a split-screen mode which allows the
user to view a graphical representation of the Web site in an
upper window 76 while viewing a corresponding textual
representation (referred to as "List View") in a lower window
78. To expose the List View window 78, the user drags
and drops the separation bar 80 to the desired position on the
screen. Each line of text displayed in the List View window
78 represents one node of the site map, and includes various
information about the node. For each node, this information
includes: the URL (i.e., address), an annotation, the scanning
status (OK, not found, inaccessible, etc.), the associated
communications protocol (HTTP, mailto, FTP, etc.), the
content type, the file size (known only if the entire file has
been retrieved), the numbers of inbound links and outbound
links, and the date and time of last modification. (The
outbound link and last modification information can be
exposed in the FIG. 4 screen display by dragging the
horizontal scrolling control 77 to the right.)

As described below, this information about the nodes is
obtained by Astra during the scanning process, and is stored
in the same data structure 114 (FIG. 9) that is used to build
the map. As additionally described below, whenever the user
initiates an Automatic Update, Astra uses the date/time of
last modification information stored locally in association
with each previously-mapped HTML document to determine
whether the document needs to be retrieved and
parsed. (The parsing process is used to identify links to other
URLs, and to identify other HTML elements relevant to the
mapping process.) As indicated above, this provides the
significant advantage of allowing the Web site to be 
re-mapped without having to repeat the entire scanning/ 
parsing process. 
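
The decision made for each previously-mapped document during an Automatic Update can be illustrated by the following Python sketch. It assumes that both the locally stored value and the value currently reported by the server are HTTP-style date strings; the function name `needs_reparse` and the rule of rescanning when either value is missing are assumptions made for illustration, not details taken from the scanning protocol itself.

```python
from email.utils import parsedate_to_datetime


def needs_reparse(stored_last_modified, reported_last_modified):
    """Return True when a previously mapped document must be fetched again."""
    if stored_last_modified is None or reported_last_modified is None:
        return True  # nothing to compare against, so rescan to be safe
    stored = parsedate_to_datetime(stored_last_modified)
    reported = parsedate_to_datetime(reported_last_modified)
    return reported > stored
```

Documents for which the function returns False keep their existing nodes and links in the map, which is what allows the update to finish faster than the original mapping.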

With further reference to FIG. 4, whenever the user 
selects a node in the upper window 76, the corresponding 
line in the List View window 78 is automatically high- 
lighted. (As illustrated by node 84 in FIG. 4, Astra graphi- 
cally represents the selection of a node by outlining the 
node's icon in black.) Likewise, whenever the user selects a 
line in the List View window 78, the corresponding node is 
automatically highlighted in the upper window 76. This 
feature allows the user to rapidly and efficiently associate 
each textual line with its graphical counterpart, and vice 
versa. In addition, by clicking on the headers 82 of the 
separation bar 80, the user can view the listed URLs in a 
sorted order. For example, if the user clicks on the "in links" 
header, Astra will automatically sort the list of URLs accord- 
ing to the number of incoming links, and then display the 
sorted listing in the List View window 78. 
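
The header-click sorting described above amounts to ordering the List View rows by one field. A minimal Python sketch follows; the `ListViewRow` record holds only a few of the listed fields, and the field names, the function `sort_rows`, and the default descending order are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ListViewRow:
    url: str
    status: str
    in_links: int
    out_links: int


def sort_rows(rows, column, descending=True):
    """Return the rows ordered by the named column (hypothetical helper)."""
    return sorted(rows, key=lambda row: getattr(row, column), reverse=descending)
```

Sorting by `in_links` in this way brings the most frequently referenced URLs to the top of the list.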

FIG. 5 illustrates a Pan Window feature of Astra. This 
feature facilitates navigation of the site map while in a 
zoomed-in mode by presenting the user with a perspective 
view of the navigational position within the map. To display 
the Pan Window 86, the user selects the "Pan Window" 
menu option from the VIEW menu while viewing a map. 
Within the Pan Window, the user is presented with a display 
of the entire map 30, with a dashed box 87 indicating the 
portion of the map that corresponds to the zoomed-in screen 
display. As the user navigates the site map (using the 
scrolling controls 40, 42 and/or other navigational controls), 
the dashed box automatically moves along the map to track 
the zoomed-in screen display. The user can also scroll 
through the map by simply dragging the dashed box 87 with 
the mouse. In the preferred embodiment, the Pan Window 
feature is implemented in part using a commercially-available
product from Stingray™ Corporation called SEC++, which
is designed to facilitate the zoomed-in viewing of a general 
purpose graphic image. 
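
The position of the dashed box 87 follows from scaling the zoomed-in viewport down to the Pan Window. The Python sketch below shows that calculation under the assumption that the map, the viewport and the Pan Window are axis-aligned rectangles measured in consistent units; the function name `pan_box` and its parameters are hypothetical and do not reflect the SEC++ interface.

```python
def pan_box(map_w, map_h, view_x, view_y, view_w, view_h, pan_w, pan_h):
    """Return (x, y, width, height) of the dashed box inside the Pan Window.

    (view_x, view_y, view_w, view_h) is the zoomed-in region in map
    coordinates; (pan_w, pan_h) is the size of the Pan Window.
    """
    return (view_x * pan_w / map_w,
            view_y * pan_h / map_h,
            view_w * pan_w / map_w,
            view_h * pan_h / map_h)
```

Dragging the dashed box corresponds to the inverse mapping: the new box position is scaled back up to map coordinates to scroll the main display.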

FIG. 6 illustrates the general display format used by Astra 
for displaying the outgoing links of a selected node 88. To 
display a node's outgoing links, the user selects the node 
with the mouse and then clicks on the "show outgoing links" 
button 72 on the tool bar. As illustrated, Astra then displays 
all outgoing links from the node (including any links that do 
not appear in the VWD site map), and displays additional 
levels of outgoing links (if any) which emanate from the 
children of the selected node. The display format used for 
this purpose is in the general format of a tree, with the 
selected node displayed as the root of the tree. An analogous 
display format (illustrated in FIG. 22) is used for displaying 
the incoming links to a node.

V. Astra Software Architecture (FIGS. 7 and 8)

FIG. 7 pictorially illustrates the general architecture of 
Astra, as installed on a client computer 92. As illustrated, the
architecture generally consists of a core Astra component 94
which communicates with a variety of different Astra plug-
in applications 96 via a plug-in API 98. The Astra core 94
includes the basic functionality for the scanning and map-
ping of Web sites, and includes the above-described GUI
features for facilitating navigation of Web site maps.
Through the plug-in API 98, the Astra core 94 provides an
extensible framework for allowing new applications to be 
written which extend the basic functionality of the Astra 
core. As described below, the architecture is structured such 
that the plug-in applications can make extensive use of Astra 
site maps to display plug-in specific information. 



The Astra plug-ins 96 and API 98 are based on OLE
Automation technology, which provides facilities for allow-
ing the plug-in components to publish information to other
objects via the operating system registry (not shown). (The
"registry" is a database used under the Windows® 95 and
Windows® NT operating systems to store configuration
information about a computer, including information about
Windows-based applications installed on the computer.) At
start-up, the Astra core 94 reads the registry to identify the
Astra plug-ins that are currently installed on the client
computer 92, and then uses this information to launch the
installed plug-ins.

In a preferred implementation, the architecture includes
five Astra plug-ins: Link Doctor, Action Tracker, Test World,
Load Wizard and Search Meter. The functions performed by
these plug-ins are summarized in Table 2. Other applica-
tions which will normally be installed on the client computer
in conjunction with Astra include a standard Web browser
(FIGS. 11 and 12), and one or more editors (not shown) for
editing URL content.

TABLE 2

PLUG-IN          FUNCTION PERFORMED

Link Doctor      Fixes broken links automatically.

Action Tracker   Retrieves and evaluates server access log files to
                 generate Web site activity data (such as activity
                 levels on individual links), and superimposes such
                 data on the site map in a user-adjustable manner.

Test World       Generates and drives tests automatically.

Load Wizard      Utilizes Action Tracker activity data to automatically
                 generate test scenarios for the load-testing of Web
                 sites with Mercury Interactive's LoadRunner™ and
                 SiteTest™ software packages. (In the implementation
                 described herein, the Load Wizard functionality is
                 included within the Action Tracker plug-in.)

Search Meter     Displays search engine results visually.



The Astra API allows external client applications, such as
the plug-in applications 96 shown in FIG. 7, to communicate
with the Astra core 94 in order to perform a variety of tasks. Via
this API, client applications can perform the following types
of operations:

1. Superimpose graphical information on the site map; 

2. Access information gathered by the Astra scanning 
engine in order to generate Web site statistics; 

3. Attach custom attributes to the site map, and to indi- 
vidual nodes and links of the site map; 

4. Access some or all of a Web page's contents (HTML) 
during the Web site scanning process; 

5. Embed the Astra GUI within the client application; 

6. Add menu items to the Astra menu; and 

7. Obtain access to network functionality. 

The specific objects and methods associated with the API are 
discussed below with reference to FIG. 8. In addition, a 
complete listing of the API is included in the microfiche 
appendix as Appendix B. 

During the Web site scanning process, the Astra core 94 
communicates over the Internet 110 (or an intranet) with the 
one or more Web server applications 112 ("Web servers") 
which make up the subject Web site 113. The Web servers 
112 may, for example, run on a single computer, run on 
multiple computers located at a single geographic location 
(which may, but need not, be the location of the client 
computer 92), or run on multiple computers that are geo-
graphically distributed. In addition, the Web servers 112 of
the Web site 113 may be virtually distributed across multiple 
Internet domains. 

As is conventional with Internet applications, the Astra 
core 94 uses the TCP/IP layer 108 of the computer's
operating system to communicate with the Web site 113. 
Any one or more of the Astra plug-ins 96 may also use the 
TCP/IP layer 108 to communicate with the Web site 113. In 
the preferred embodiment, for example, the Action Tracker 
plug-in communicates with the Web sites (via the Astra 
plug-in API) to retrieve server access log files for perform- 
ing Web site activity analyses. 

FIG. 8 illustrates the object model used by the Astra API.
As illustrated, the model includes six classes of objects, all
of which are implemented as OLE Automation objects. By
name, the six object classes are Astra, Site Graph, Edges,
Edge, Nodes, and Node. The Astra object 94 is an applica-
tion object, and corresponds generally to the Astra core 94
shown in FIG. 7. The Astra object 94 accesses and manipu-
lates data stored by a Site Graph object 114. Each Site Graph
object corresponds generally to a map of a Web site, and
includes information about the URLs and links (including
links not displayed in the Visual Web Display view) of the
Web site. The site-specific data stored by the Site Graph
object 114 is contained within and managed by the Edges,
Edge, Nodes and Node objects, which are subclasses of the
Site Graph object.

Each Node object 115 represents a respective node (URL)
of the site map, and each Edge object 116 represents a
respective link between two URLs (nodes) of the map.
Associated with each Node object and each Edge object is
a set of attributes (not shown), including display attributes
which specify how the respective object is to be represented
graphically within the site map. For example, each Node
object and each Edge object include respective attributes for
specifying the color, visibility, size, screen position, and an
annotation for the display of the object. These attributes can
be manipulated via API calls to the methods supported by
these objects 115, 116. For example, the Astra plug-ins (FIG.
7) can manipulate the visibility attributes of the Edge objects
to selectively hide the corresponding links on the screen.
(This feature is illustrated below in the description of the
Action Tracker plug-in.) In addition, the Astra API includes
methods for allowing the plug-ins to define and attach
custom attributes to the Edge and Node objects.

The Nodes and Edges objects 118, 119 are container
objects which represent collections of Node objects 115 and
Edge objects 116, respectively. Any criterion can be used by
the applications for grouping together Node objects and
Edge objects. As depicted in FIG. 8, a single Site Graph object
114 may include multiple Nodes objects 118 and multiple
Edges objects 119.
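The relationships among these objects can be pictured with a short sketch. The Python classes below are hypothetical stand-ins for the OLE Automation objects; the class layout, attribute names (`visible`, `color`, `custom`) and the `add_link` and `group_edges` helpers are assumptions made for illustration only and are not taken from the API listing of Appendix B.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Node:
    """One URL of the site map (hypothetical stand-in for a Node object)."""
    url: str
    visible: bool = True                                   # display attribute
    color: str = "black"                                   # display attribute
    custom: Dict[str, Any] = field(default_factory=dict)   # plug-in data


@dataclass
class Edge:
    """One link between two URLs (hypothetical stand-in for an Edge object)."""
    source: Node
    target: Node
    visible: bool = True                                   # display attribute
    custom: Dict[str, Any] = field(default_factory=dict)   # plug-in data


@dataclass
class SiteGraph:
    """Holds every node and link of one Web site, displayed or not."""
    nodes: List[Node] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)

    def add_link(self, source: Node, target: Node) -> Edge:
        for node in (source, target):
            if node not in self.nodes:
                self.nodes.append(node)
        edge = Edge(source, target)
        self.edges.append(edge)
        return edge

    def group_edges(self, criterion: Callable[[Edge], bool]) -> List[Edge]:
        """Group edges by any criterion, in the spirit of an Edges container."""
        return [e for e in self.edges if criterion(e)]


# A plug-in attaches a custom attribute, then hides links with no activity.
graph = SiteGraph()
home = Node("http://example.com/")
about = Node("http://example.com/about.html")
link = graph.add_link(home, about)
link.custom["hits"] = 0
for edge in graph.group_edges(lambda e: e.custom.get("hits", 0) == 0):
    edge.visible = False
```

In this sketch a link is hidden by changing a display attribute rather than by deleting it, so the underlying graph remains complete.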

The methods of the Astra plug-in API generally fall into 
five functional categories. These categories, and the objects 
to which the associated methods apply, are listed below. 
Additional information on these methods is provided in the
API listing in Appendix B. 

ASTRA GUI METHODS. These methods control various
aspects of the Astra GUI, such as adding, deleting,
enabling and disabling Astra menu items. Supporting
objects: Astra, Site Graph.

GROUPING AND ACCESS METHODS. These methods
permit groupings of nodes and links to be formed, and
permit the nodes and links within these groups to be
accessed. Supporting objects: Site Graph, Nodes,
Edges.

NODE/EDGE APPEARANCE METHODS. These meth-
ods provide control over display attributes (visibility,
color, etc.) of links and nodes of the map. Supporting
objects: Node, Edge.

ATTRIBUTE ATTACHMENT METHODS. These meth-
ods permit the attachment of custom information to
specific objects, and provide access to such informa-
tion. Supporting objects: Site Graph, Node, Edge.
Example use: Number of "hits" displayed by Action
Tracker.

SCAN-TIME CONTENT ACCESS METHODS. These
methods provide access by applications to Web page
content retrieved during the scanning process. Support-
ing objects: Site Graph, Node. Example use: At scan
time, textual content of each page is passed to a spell
checker application to perform a site-wide spell check.

As will be appreciated from the foregoing, the Astra
architecture provides a highly extensible mapping frame-
work which can be extended in functionality by the addition
of new plug-in applications. Additional aspects of the
architecture are specified in the API description of Appendix
B.

VI. Scanning Process (FIGS. 9 and 10) 

As will be apparent, the terms "node" and "link" are used 
in portions of the remaining description to refer to their 
corresponding object representations — the Node object and 
the Edge object. 

The multi-threaded scanning process used by the Astra
core 94 for scanning and mapping a Web site will now be
described with reference to FIGS. 9 and 10. As depicted in
FIG. 9, Astra uses two types of threads to scan and map the
Web site: a main thread 120 and multiple lower-level
scanning threads 122. The use of multiple scanning threads
provides the significant benefit of allowing multiple server
requests to be pending simultaneously, which in turn
reduces the time required to complete the scanning process.
A task manager process (not shown) handles issues related
to the management of the threads, including the synchroni-
zation of the scanning threads 122 to the main thread 120,
and the allocation of scanning threads 122 to operating
system threads.

The main thread 120 is responsible for launching the 
scanning threads 122 on a URL-by-URL basis, and uses the 
URL-specific information returned by the scanning threads 
122 to populate the Site Graph object 114 ("Site Graph") 
with the nodes, links, and associated information about the 
Web site 113. In addition, as pictorially illustrated by the 
graph and map symbols in box 114, the main thread 120 
periodically applies the Solar Layout method to the nodes 
and links of the Site Graph 114 to generate a map data 
structure which represents the Visual Web Display map of 
the Web site. (As described below, this map data structure is
generated by manipulating the display attributes of the Node 
objects and Edge objects, and does not actually involve the 
generation of a separate data structure.) 

Upon initiation of the scanning process by the user, the 
main thread 120 obtains the URL (address) of the home page 
(or the URL of some other starting location) of the Web site 
to be scanned. If the scanning process is initiated by select- 
ing the "Automatic Update" option, the main thread 120 
obtains this URL from the previously-generated Site Graph 
114. Otherwise, the user is prompted to manually enter the 
URL of the home page. 

Once the home page URL has been obtained, the main 
thread 120 launches a scanning thread 122 to scan the 
HTML home page. As the HTML document is returned, the 
scanning thread 122 parses the HTML to identify links to 
other URLs, and to identify other predetermined HTML 
elements (such as embedded forms) used by Astra. (As
described below with reference to FIG. 10, if an Automatic
Update is being performed, the scanning thread downloads
the home page only if the page has been modified since the
last scanning of the URL; if no download of the page is
required, the outgoing link information is extracted from the
previously-generated Site Graph 114.) In addition, the scan- 
ning thread 122 extracts certain information from the header 
of the HTML document, including the date/time of last 
modification, and the other information displayed in the List 
View window 78 of FIG. 4. The link and header information 
extracted by the scanning thread 122 is represented in FIG. 
9 by one of the boxes 130 labeled "URL data." 
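The parsing step performed by a scanning thread can be sketched as follows. This is a minimal illustration using Python's standard `html.parser` module, not the parser used by Astra; the class name `LinkAndFormParser` and the helper `parse_page` are hypothetical, and only anchor links and form actions are collected.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndFormParser(HTMLParser):
    """Collects outgoing link URLs and form action URLs from one page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links = []   # absolute URLs found in <a href=...>
        self.forms = []   # absolute URLs found in <form action=...>

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(urljoin(self.base_url, attributes["href"]))
        elif tag == "form":
            # A form without an action submits to the page's own URL.
            action = attributes.get("action") or ""
            self.forms.append(urljoin(self.base_url, action))


def parse_page(base_url: str, html: str) -> LinkAndFormParser:
    """Parse one HTML document and return the populated parser."""
    parser = LinkAndFormParser(base_url)
    parser.feed(html)
    parser.close()
    return parser
```

Relative links are resolved against the page's own URL so that each identified link can be compared with the URLs already present in the graph.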

Upon completion, the scanning thread 122 notifies the 
main thread 120 that it has finished scanning the home page. 
The main thread then reads the URL data extracted by the 
scanning thread 122 and stores this data in the Site Graph 
114 in association with a Node object which represents the 
home page URL. In addition, for each internal link (i.e., link 
to a URL within the same Internet domain) identified by the 
scanning thread 122, the main thread 120 creates (or 
updates) a corresponding Edge object and a corresponding 
Node object within the Site Graph 114, and launches a new 
scanning thread 122 to read the identified URL. (Edge and 
Node objects are also created for links to external URLs, but 
these external URLs are not scanned in the default mode.) 
These newly-launched scanning threads then proceed to 
scan their respective URLs in the same manner as described 
above (with the exception that no downloading and parsing 
is performed when the subject URL is a non-HTML file). 
Thus, scanning threads 122 are launched on a URL-by-URL 
basis until either all of the URLs of the site have been 
scanned or the user halts the scanning process. Following the 
completion of the scanning process, the Site Graph 114 fully 
represents the site map of the Web site, and contains the 
various URL-specific information displayed in the Astra List 
View window 78 (FIG. 4). When the user saves a site map 
via the Astra GUI, the Site Graph 114 is written to disk. 
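The division of labor between the main thread and the scanning threads can be sketched as follows. This is an illustration under stated assumptions rather than the Astra implementation: a thread pool stands in for the task manager, the dictionary `FAKE_SITE` is a hypothetical in-memory Web site used in place of real server requests, and the function names are invented for the example. As in the description above, only the main thread updates the graph, and external links are recorded but not scanned.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from urllib.parse import urlparse

# Hypothetical in-memory "Web site": URL -> outgoing links.
FAKE_SITE = {
    "http://example.com/": ["http://example.com/a.html", "http://other.org/"],
    "http://example.com/a.html": ["http://example.com/", "http://example.com/b.html"],
    "http://example.com/b.html": [],
}


def scan_url(url):
    """Scanning-thread work: return the URL data for one page."""
    return url, FAKE_SITE.get(url, [])


def scan_site(home_url, max_threads=4):
    """Main-thread loop: launch one scan per URL and build the graph."""
    domain = urlparse(home_url).netloc
    nodes, edges = {home_url}, set()
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        pending = {pool.submit(scan_url, home_url)}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                url, links = future.result()
                for target in links:
                    edges.add((url, target))
                    if target in nodes:
                        continue  # already known; do not scan it again
                    nodes.add(target)
                    # Only internal links are scanned in the default mode.
                    if urlparse(target).netloc == domain:
                        pending.add(pool.submit(scan_url, target))
    return nodes, edges
```

Because several scans can be pending at once, slow server responses overlap instead of being handled one after another.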

In a default mode, links to external URLs detected during 
the scanning process are displayed in the site map using the
"not scanned" icon (192 in FIG. 13), indicating that these
URLs have not been verified. If the user selects a "verify 
external links" scanning option prior to initiating the scan- 
ning process, Astra will automatically scan these external 
URLs and update the map accordingly. 

As part of the HTML parsing process, the scanning 
threads 122 detect any forms that are embedded within the 
HTML documents. (As described below, these forms are 
commonly used to allow the user to initiate back-end 
database queries which result in the dynamic generation of 
Web pages.) When a form is detected during an Automatic 
Update operation, the main thread 120 checks the Site Graph 
114 to determine whether one or more datasets (captured by 
Astra as part of the Dynamic Scan feature) have been stored 
in association with the HTML document. For each dataset 
detected, Astra performs a dynamic page scan operation 
which involves the submission of the dataset to the URL 
specified within the form. This feature is further described 
below under the heading SCANNING AND MAPPING OF 
DYNAMICALLY-GENERATED PAGES. 

Once the entire Web site 113 has been scanned, the Site
Graph 114 represents the architecture of the Web site,
including all of the detected URLs and links of the site. (If
the user pauses the scanning process prior to completion, the
Site Graph and VWD map represent a scanned subset of the
Web site.) As described above, this data structure 114 is in
the general form of a list of Node objects (one per URL) and
Edge objects (one per link), with associated information
attached as attributes of these objects. For each URL of the
site, the information stored within the Site Graph typically
includes the following: the URL type, the scanning status
(OK, not found, inaccessible, unread, or access denied), the
date and time of last modification, the URLs (addresses) of
all incoming and outgoing links, the file size (if the URL was
actually retrieved), an annotation, and the associated proto-
col.

Periodically during the scanning process, the main thread
120 executes a Visual Web Display routine which applies the
Solar Layout method to the URLs and links of the Site
Graph 114. (The term "routine," as used herein, refers to a
functionally-distinguishable portion of the executable code
of a larger program or software package, but is not intended
to imply the modularity or callability of such code portion.)
As indicated above, this method selects the links to be
displayed within the site map (by selecting a span tree from
the graph structure), and determines the layout and size for
the display of the nodes (URLs) and non-hidden links of the
map. The execution of this display routine results in modi-
fications to the display attributes of the nodes (Node objects)
and links (Edge objects) of the Site Graph 114 in accordance
with the above-described layout and display principles. For
example, for each link which is not present in the span tree,
the visibility attribute of the link is set to "hidden." (As
described below, link and node attributes are also modified
in response to various user actions during the viewing of the
map, such as the application of filters to the site map.)

In the preferred embodiment, the Visual Web Display
routine is executed each time a predetermined threshold
number of new URLs has been scanned. Each time the routine is
executed, the screen is automatically updated (in Visual Web
Display format) to show the additional URLs that have been
identified since the last execution of the routine. This allows
the user to view the step-by-step generation of the site map
during the scanning process. The user can selectively pause
and restart the scanning process using respective controls on
the Astra toolbar 46.

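The effect of the display routine on link visibility can be illustrated with a short sketch. The code below marks each link as "visible" or "hidden" according to whether it belongs to a tree rooted at the home page. It assumes that the tree is chosen by a breadth-first traversal; this portion of the description does not state how the Solar Layout method selects its tree, so the traversal order, the function name `mark_span_tree` and the tuple representation of links are all assumptions.

```python
from collections import deque


def mark_span_tree(home, edges):
    """Return {edge: "visible" | "hidden"} for a list of (source, target) links.

    Assumption: the tree is chosen by breadth-first search from the home
    page; the description only says that a span tree is selected.
    """
    outgoing = {}
    for source, target in edges:
        outgoing.setdefault(source, []).append(target)

    visibility = {edge: "hidden" for edge in edges}
    reached = {home}
    queue = deque([home])
    while queue:
        source = queue.popleft()
        for target in outgoing.get(source, []):
            if target not in reached:
                # First link to reach a node becomes its displayed tree link.
                reached.add(target)
                visibility[(source, target)] = "visible"
                queue.append(target)
    return visibility
```

Every reachable node ends up with exactly one visible incoming link, while the remaining links stay in the graph with their visibility attribute set to "hidden."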
FIG. 10 illustrates the general decision process followed
by a scanning thread 122 when a URL is scanned. This
process implements the above-mentioned caching scheme
for reducing unnecessary downloads of URLs and URL
headers during Automatic Update operations. With reference
to decision block 140, it is initially determined whether the
URL has previously been scanned. If it has not been
scanned, the thread either requests the file from the server (if
the URL is an HTML file), or else requests the URL's header
from the server, as illustrated by blocks 142-146. (URL
headers are retrieved using the HEAD method of the HTTP
protocol.) In either case, the scanning thread waits for the
server to respond, and generates an appropriate status code
(such as a code indicating that the URL was not found or was
inaccessible) if a timeout occurs or if the server returns an
error code, as indicated by block 150.

If, on the other hand, the URL has previously been
mapped (block 140), the date/time of last modification
stored in the Site Graph 114 (FIG. 9) is used to determine
whether or not a retrieval of the URL is necessary. This is
accomplished using standard argument fields of the HTTP
"GET" method which enable the client to specify a "date/
time of last modification" condition for the return of the file.
With reference to blocks 158 and 160, the GET request is for
the entire URL if the file is an HTML file (block 158), and
is for the URL header if the file is a non-HTML file (block
160). Referring again to block 150, the thread then waits for
the server response, and returns an appropriate status code if
an error occurs.

As indicated by block 164, if an HTML file is returned as
the result of the server request, the scanning thread parses
the HTML and identifies any links within the file to other
URLs. As indicated above, the main thread 120 launches
additional scanning threads 122 to scan these URLs if any
links are detected, with the exception that external links are
not scanned unless a "verify external links" option has been
selected by the user.
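The decision process of FIG. 10 can be summarized in a short sketch. The code below only plans the request; it does not contact a server. It assumes that the "date/time of last modification" condition is expressed with the standard HTTP `If-Modified-Since` request header, and that a header-only request for a non-HTML file is made with the HEAD method in both branches. The function names `plan_request` and `needs_reparse` are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict, Optional, Tuple


def plan_request(is_html: bool,
                 last_modified: Optional[datetime]) -> Tuple[str, Dict[str, str]]:
    """Choose the HTTP method and headers for scanning one URL.

    last_modified is the date/time stored from a previous scan, or None
    if the URL has not been scanned before.
    """
    # Full document for HTML (it must be parsed); header only otherwise.
    method = "GET" if is_html else "HEAD"
    headers: Dict[str, str] = {}
    if last_modified is not None:
        # Ask the server to answer 304 Not Modified if nothing changed.
        headers["If-Modified-Since"] = format_datetime(
            last_modified.astimezone(timezone.utc), usegmt=True)
    return method, headers


def needs_reparse(status_code: int, is_html: bool) -> bool:
    """Only a freshly returned HTML document is parsed for links."""
    return is_html and status_code == 200
```

When the server answers that the file is unchanged, no document body is transferred and the link information already held in the Site Graph can be reused.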

As indicated by the foregoing, the scanning process of the 
present invention provides a high degree of bandwidth 
efficiency by avoiding unnecessary retrievals of URLs and 
URL headers that have not been modified since the previous 
mapping, and by using multiple threads to scan the Web site.

VII. Scanning and Mapping of Dynamically-Generated
Pages (FIGS. 11-15)

A feature of the invention which permits the scanning and 
mapping of dynamically-generated Web pages will now be 
described. By way of background, a dynamically-generated 
Web page ("dynamic page") is a page that is generated 
"on-the-fly" by a Web site in response to some user input, 
such as a database query. Under existing Web technology, 
the user manually types the information (referred to
herein as the "dataset") into an embedded form of an HTML
document while viewing the document with a Web browser,
and then selects a "submit" type button to submit the dataset
to a Web site that has back-end database access or real-time
data generation capabilities. (Technologies which provide
such Web server extension capabilities include CGI,
Microsoft's ISAPI, and Netscape's NSAPI.) A Web server
extension module (such as a CGI script) then processes the
dataset (by, for example, performing a database search, or 
generating real-time data) to generate the data to be returned 
to the user, and the data is returned to the browser in the form 
of a standard Web page. 

One deficiency in existing Web site mapping programs is 
that they do not support the automatic retrieval of dynamic 
pages. As a result, these mapping programs are not well 
suited for tracking changes to back-end databases, and do 
not provide an efficient mechanism for testing the function- 
ality of back-end database search components. In accor- 
dance with one aspect of the invention, these deficiencies are 
overcome by providing a mechanism for capturing datasets 
entered by the user into a standard Web browser, and for 
automatically re-submitting such datasets during the updat- 
ing of site maps. The feature of Astra which provides these 
capabilities is referred to as Dynamic Scan. 

FIG. 11 illustrates the general flow of information 
between components during a Dynamic Scan capture 
session, which can be initiated by the user from the Astra 
tool bar. Depicted in the drawing is a client computer 92 
communicating with a Web site 113 over the Internet 110 via 
respective TCP/IP layers 108, 178. The Web site 113 
includes a Web server application 112 which interoperates 
with CGI scripts (shown as layer 180) to generate Web pages 
on-the-fly. Running on the client computer 92 in conjunction 
with the Astra application 94 is a standard Web browser 170 
(such as Netscape Navigator or Microsoft's Internet 
Explorer), which is automatically launched by Astra when 
the user activates the capture session. As illustrated, the Web 
browser 170 is configured to use the Astra application 94 as 
an HTTP-level proxy. Thus, all HTTP-level messages (client
requests) generated by the Web browser 170 are initially
passed to Astra 94, which in turn makes the client requests
on behalf of the Web browser. Server responses (HTML
pages, etc.) to such requests are returned to Astra by the
client computer's TCP/IP layer 108, and are then forwarded
to the browser to maintain the impression of normal brows-
ing.

During the Dynamic Scan capture session, the user types
data into one or more fields 174 of an HTML document
172 while viewing the document with the browser 170. The
HTML document 172 may, for example, be an internal URL
which is part of a Web site map, or may be an external URL
which has been linked to the site map for mapping purposes.
When the user submits the form, Astra extracts the
manually-entered dataset, and stores this dataset (in asso-
ciation with the HTML document 172) for subsequent use.
When Astra subsequently re-scans the HTML document 172
(during an Automatic Update of the associated site map),
Astra automatically retrieves the dataset, and submits the
dataset to the Web site 113 to recreate the form submission.
Thus, for example, once the user has typed in and submitted
a database query in connection with a URL of a site map,
Astra will automatically perform the database query (and
map the results, as described below) the next time an
Automatic Update of the map is performed.
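The capture and later re-submission of a dataset can be sketched as follows. This is an illustration under stated assumptions rather than the Astra implementation: the dictionary `captured` is a hypothetical stand-in for storage in the Site Graph, the function names are invented for the example, and the form fields are assumed to arrive in the usual URL-encoded form. The requests are only constructed, not sent.

```python
from urllib.parse import parse_qsl, urlencode
from urllib.request import Request

# Hypothetical store: form page URL -> list of captured submissions.
captured = {}


def capture_submission(form_page_url, method, action_url, encoded_fields):
    """Record the dataset of one form submission seen by the proxy."""
    dataset = parse_qsl(encoded_fields, keep_blank_values=True)
    captured.setdefault(form_page_url, []).append(
        {"method": method.upper(), "action": action_url, "dataset": dataset})
    return dataset


def replay_requests(form_page_url):
    """Rebuild one request per stored dataset for a later update scan."""
    requests = []
    for item in captured.get(form_page_url, []):
        encoded = urlencode(item["dataset"])
        if item["method"] == "POST":
            # POST carries the dataset in the request body.
            requests.append(Request(item["action"],
                                    data=encoded.encode("ascii"),
                                    method="POST"))
        else:
            # GET carries the dataset in the query string.
            requests.append(Request(item["action"] + "?" + encoded,
                                    method="GET"))
    return requests
```

Keying the stored datasets by the URL of the form page allows every dataset recorded for that page to be replayed whenever the page is re-scanned.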

With further reference to FIG. 11, when the Web site 113
returns the dynamic page during the capture session (or
during a subsequent Automatic Update session), Astra auto-
matically adds a corresponding node to the site map, with
this node being displayed as being linked to the form page.
(Screen displays taken during a sample capture session are
shown in FIGS. 13-15 and are described below.) In addition,
Astra parses the dynamic page, and adds respective nodes to
the map for each outgoing link of the dynamic page. (In the
default setting, these outgoing links are not scanned.) Astra
also parses any static Web pages that are retrieved with the
browser during the Dynamic Scan capture session, and
updates the site map (by appending appropriate URL icons)
to reflect the static pages.

FIG. 12 illustrates the general flow of information during
a Dynamic Scan capture session, and will be used to
describe the process in greater detail. Labeled arrows in
FIG. 12 represent the flow of information between software
and database components of the client and server computers.
As will be apparent, certain operations (such as updates to
the map structure 128) need not be performed in the order
shown.

Prior to initiating the Dynamic Scan session, the user
specifies a page 172 which includes an embedded form.
(This step is not shown in FIG. 12.) This can be done by
browsing the site map with the Astra GUI to locate the node
of a form page 172 (depicted by Astra using a special icon),
and then selecting the node with the mouse. The user then
initiates a Dynamic Scan session, which causes the follow-
ing dialog to appear on the screen: YOU ARE ABOUT TO
ENTER DYNAMIC SCAN MODE. IN THIS MODE YOU
WORK WITH A BROWSER AS USUAL, BUT ALL
YOUR ACTIONS (INCLUDING FORM SUBMISSIONS)
ARE RECORDED IN THE SITE MAP. TO EXIT FROM
THIS MODE, PRESS THE "STOP DYNAMIC SCAN"
BUTTON ON THE MAIN TOOLBAR OR CHOOSE THE
"STOP DYNAMIC SCAN" OPTION IN THE SCAN
MENU.

When the user clicks on the "OK" button, Astra modifies
the configuration of the Web browser 170 within the registry
182 of the client computer to set Astra 94 as a proxy of the
browser, as illustrated by arrow A of FIG. 12. (As will be
recognized by those skilled in the art, the specific modifi-
cation which needs to be made to the registry 182 depends
upon the default browser installed on the client computer.)
Astra then launches the browser 170, and passes the URL
(address) of the selected form page to the browser for
display. Once the browser has been launched, Astra modifies
the registry 182 (arrow B) to restore the original browser
configuration. This ensures that the browser will not attempt
to use Astra as a proxy on subsequent browser launches, but
does not disable the browser's use of Astra as a proxy during the
Dynamic Scan session.

As depicted in FIG. 12, the browser 170 retrieves and 
displays the form page 172, enabling the user to complete 
the form. In response to the submission by the user of the 
form, the browser 170 passes an HTTP-level (GET or 
POST) message to Astra 94, as indicated by arrow C. This 
message includes the daiaset entered by the user, and speci- 
fies the URL (address) of the CGI script or other Web server 
extension component 180 to which the form is addressed. 
Upon receiving this HTTP message, Astra displays the 
dialog "YOU ARE ABOUT TO ADD A DATA SET TO THE 



page selected, and then typing in the word "school" into the 
query field 194 of the page. (Intermediate displays generated 
by Astra during the Dynamic Scan session are omitted.) As 
illustrated in the figure, the Web browser comes up within a 
window 196, allowing the user to access the Astra controls 
and view the site map 190 during the Dynamic Scan session. 

FIG. 15 illustrates the updated map 190' generated by 
Astra as a result of the FIG. 14 database query. The node 
(icon) 200 labeled "titles" in the map represents the dynamic 
page returned by the Infoscek™ Web site, and is depicted by 
Astra as being linked to the Infoseek™ form page. A special 
"dynamic page" icon 200 is used to represent this newly- 
added node, so that the user can readily distinguish the node 
from nodes representing statically-generated pages. The 
children 204 of the dynamic page node 200 represent 



CURRENT URL IN THE SITE MAP," and presents the user 15 outgoing links from the dynamic page, and are detected by 



with an "OK" button and a "CANCEL" button 

Assuming the user selects the OK button, Astra extracts 
the dataset entered by the user and then forwards the 
HTTP-level message to its destination, as illustrated by 
arrow E. In addition, as depicted by arrow D, Astra stores 20 
this dataset in the Site Graph 114 in association with the 
form page 172. As described above, this dataset will auto- 
matically be retrieved and re -submitted each time the form 
page 172 is re -scanned as part of an Automatic Update 
operation. With reference to arrows F and G, when the Web 25 
server 112 returns the dynamic page 184, Astra 94 parses the 
page and updates the Site Graph 114 to reflect the page and 
any outgoing links of the dynamic page. (In this regard, 
Astra handles the dynamic page in the same manner as for 
other HTML documents retrieved during the normal scan- 
ning process.) In addition, as depicted by arrow H, Astra 
forwards the dynamic page 184 to the Web browser 170 
(which in-turn displays the page) to maintain an impression 
of normal Web browsing. 



Following the above sequence, the user can select the
"stop dynamic scan" button or menu option to end the
capture session and close the browser 170. Alternatively, the
user can continue the browsing session and make additional
updates to the site map. For example, the user can select the
"back" button 186 (FIG. 14) of the browser to go back to the
form page and submit a new dataset, in which case Astra will
record the dataset and resulting page in the same manner as
described above.

Although the system of the preferred embodiment utilizes
conventional proxy technology to redirect and monitor the
output of the Web browser 170, it will be recognized that
other technologies and redirection methods can be used for
this purpose. For example, the output of the Web browser
could be monitored using conventional Internet firewall
technologies.

FIGS. 13-15 are a sequence of screen displays taken
during a Dynamic Scan capture session in which a simple
database query was entered into a search page of the
Infoseek™ search engine. FIG. 13, which is the first display
screen of the sequence, illustrates a simple map 190 generated
by opening a new map and then specifying
http://www.infoseek.com/ as the URL. Displayed at the center of
the map is the form page icon for the Infoseek™ search
page. The 20 children 192 of the form page icon correspond
to external links (i.e., links to URLs outside the infoseek.com
domain), and are therefore displayed using the "not
scanned" icon. (As described above, if the "verify external
links" option of Astra is selected, Astra will verify the
presence of such external URLs and update the map
accordingly.)

FIG. 14 illustrates a subsequent screen display generated
by starting a Dynamic Scan session with the Infoseek™ [...]

[...] Astra by parsing the HTML of the dynamic page. In the
present example, at least some of the children 204 represent
search results returned by the Infoseek search engine and
listed in the dynamic page.

As generally illustrated by FIG. 15, in which the children
204 of the dynamic page 200 are represented with Astra's
"not scanned" icon, Astra does not automatically scan the
children of the dynamically-generated Web page during the
Dynamic Scan session. To effectively scan a child page 204,
the user can retrieve the page with the browser during the
Dynamic Scan session, which will cause Astra to parse the
child page and update the map accordingly.

Following the sequence illustrated by FIGS. 13-15, the
user can, for example, save the map 190' to disk, which will
cause the corresponding Site Graph 114 to be written to disk.
If the user subsequently retrieves the map 190' and initiates
an Automatic Update operation, Astra will automatically
submit the query "school" to the Infoseek™ search engine,
and update the map 190' to reflect the search results returned.
(Children 204 which do not come up in this later search will
not be displayed in the updated map.) By comparing this
updated map to the original map 190' (either manually or
using Astra's map comparison tool), the user can readily
identify any new search result URLs that were returned by
the search engine.

While the above-described Dynamic Scan feature is
particularly useful in Web site mapping applications, it will be
recognized that the feature can also be used in other types
of applications. For example, the feature can be used to
permit the scanning of dynamically-generated pages by
general purpose Webcrawlers. In addition, although the
feature is implemented in the preferred embodiment such
that the user can use a standard, stand-alone Web browser,
it will be readily apparent that the feature can be implemented
using a special "built-in" Web browser that is
integrated with the scanning and mapping code.
VIII. Display of Filtered Maps (FIGS. 16-18) 

The content, status and location filters of Astra provide a 
simple mechanism for allowing the user to focus-in on 
URLs which exhibit particular characteristics, while making 
use of the intuitive layout and display methods used by Astra 
for the display of site maps. To apply a filter, the user simply 
selects the corresponding filter button on the filter toolbar 47 
while viewing a site map. (The specific filters that are 
available within Astra are listed above under the heading 
ASTRA GRAPHICAL USER INTERFACE.) Astra then 
automatically generates and displays a filtered version of the 
map. In addition to navigating the filtered map using Astra's 
navigation controls, the user can select the Visual Web 
Display button 73 (FIG. 16) to view the filtered map in
Astra's VWD format. Combinations of the filters can be 
applied to the site map concurrently. 



FIG. 16 illustrates the general display format used by
Astra when a filter is initially applied to a site map. This
example was generated by selecting the "hide OK URLs"
button 220 on the filter toolbar 47 while viewing a site map
similar to the map 30 of FIG. 1. As illustrated by the screen
display, the selection of the filter causes Astra to generate a
filtered map 30' which is in the form of a skeletal view of the
original map, with only the links and URLs of interest
remaining.

As generally illustrated by FIG. 16, the filtered map 30'
consists primarily of the following components of the original
map 30: (i) the URLs which satisfy (pass through) the
filter, (ii) the links to the URLs which satisfy the filter, and
(iii) all "intermediate" nodes and links (if any) needed to
maintain connectivity between the root (home page) URL
and the URLs which satisfy the filter. (This display methodology
is used for all of the filters of the filter toolbar 47,
and is also used when multiple filters are applied.) In this
example, the filtered map 30' thus consists of the home page
URL, all URLs which have a scanning status other than
"OK," and the links and nodes needed to maintain connectivity
to the non-OK URLs. To allow the user to readily
distinguish between the two types of URLs, Astra displays
the URLs which satisfy the filter in a prominent color (such
as red) when the filtered map is viewed in a zoomed-out
mode. The general process used by Astra to generate the
skeletal view of the filtered map is illustrated by FIG. 17.
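
The skeletal-view construction described above can be sketched in a few lines. This is only an illustrative sketch and not the FIG. 17 process itself: it assumes the site map is available as a plain dictionary of outgoing links, it keeps a single breadth-first path from the root to each URL that passes the filter, and the names `skeletal_view`, `links`, and `passes_filter` are hypothetical.

```python
from collections import deque

def skeletal_view(links, root, passes_filter):
    """Return (nodes, edges) of a filtered map that stays connected to root.

    links: dict mapping each URL to the list of URLs it links to (assumed shape).
    passes_filter: function taking a URL and returning True if it satisfies the filter.
    """
    # Breadth-first search from the root, remembering one parent per reachable node.
    parent = {root: None}
    queue = deque([root])
    while queue:
        url = queue.popleft()
        for child in links.get(url, []):
            if child not in parent:
                parent[child] = url
                queue.append(child)

    keep_nodes = {root}
    keep_edges = set()
    for url in parent:
        if not passes_filter(url):
            continue
        # Walk back toward the root, keeping the intermediate nodes and links
        # until a node that is already part of the skeleton is reached.
        node = url
        while node not in keep_nodes:
            keep_nodes.add(node)
            keep_edges.add((parent[node], node))
            node = parent[node]
    return keep_nodes, keep_edges
```

With a "hide OK URLs" style filter, `passes_filter` would return True for any URL whose scanning status is something other than "OK"; every other node survives only if it lies on a kept path to the root.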

While viewing the filtered map, the user can perform any 
of a number of actions, such as zoom in and out to reveal 
additional URL information, launch editor programs to edit 
the displayed URLs, and apply additional filters to the map. 
In addition, the user can select the Visual Web Display 
button 73 to display the filtered map in Astra's VWD format. 
To restore the hidden nodes and links to the map, the user 
clicks on the selected filter button to remove the filter. 

FIG. 18 illustrates the filtered map of FIG. 16 following 
selection by the user of the VWD button 73. As generally 
illustrated by these two figures, the selection of the VWD 
button 73 causes Astra to apply the Solar Layout method to 
the nodes and links of the filtered map. In addition, to 
provide the user with a contextual setting for viewing the 
remaining URLs, Astra restores the visibility of selected 
nodes and links in the immediate vicinity of the URLs that 
satisfy the filter. As generally illustrated by node icons 226, 
228 and 230 in FIG. 18, an icon color coding scheme is used 
to allow the user to distinguish the URL icons which satisfy 
the filter from those which do not, and to allow the user to 
distinguish URLs which have not been scanned. 
IX. Tracking and Display of Visitor Activity (FIGS. 19 and
20) 

An important feature of Astra is its ability to track user
(visitor) activity and behavior patterns with respect to a Web 
site and to graphically display this information (using color 
coding, annotations, etc.) on the site map. In the preferred 
embodiment, this feature is implemented in-part by the 
Action Tracker plug-in, which gathers user activity data by 
retrieving and analyzing server log files commonly main- 
tained by Web servers. Using this feature, Webmasters can 
view site maps which graphically display such information 
as: the most frequently-accessed URLs, the most heavily 
traveled links and paths, and the most popular site entry and 
exit points. As will be appreciated by those skilled in the art, 
the ability to view such information in the context of a site 
map greatly simplifies the task of evaluating and maintain- 
ing Web site effectiveness. 

By way of background, standard Web servers commonly 
maintain server access log files ("log files") which include 
information about accesses to the Web site by users. These
files are typically maintained in one of two standard formats:
the HTTP Server Access Log File format, or the HTTP
Server Referrer Log File format. (Both of these formats are
commonly used by Web servers available from Microsoft,
Netscape, and NCSA, and both formats are supported by
Astra.) Each entry (line) in a log file represents a successful
access to the associated Web site, and contains various
information about the access event. This information normally
includes: the path to the accessed URL, an identifier
of the user (typically in the form of an IP address), and the
date and time of the access. Each log file stored on a physical
server typically represents some window of time, such as
one month.

In accordance with the invention, Astra uses the information
contained within a log file in combination with the
associated site graph to determine probable paths taken by
visitors to the Web site. (The term "visitor" is used herein to
distinguish the user of the Web site from the user of Astra,
but is not intended to imply that the Web site user must be
located remotely from the Web site.) This generally involves
using access date/time stamps to determine the chronological
sequence of URLs followed by each visitor (on a
visitor-by-visitor basis), and comparing this information
against link information stored in the site map (i.e., the Site
Graph object 114) to determine the probable navigation path
taken between the accessed URLs. (This method is
described in more detail below.) By determining the navigation
path followed by a visitor, Astra also determines the
site entry and exit points taken by the visitor and all of the
links traversed by the visitor. By performing this method for
each visitor represented in the log file and appropriately
combining the information of all of the visitors, Astra
generates statistical data (such as the number of "hits" or the
number of exit events) with respect to each link and node of
the Web site, and attaches this information to the corresponding
Node and Edge objects 115, 116 (FIG. 8) of the
Site Graph 114.

To activate the Action Tracker feature, the user selects the
Action Tracker option from the TOOLS menu while viewing
a site map. The user is then presented with the option of
either retrieving the server log file or loading a previously-
saved Astra Activity File. Astra Activity Files are compressed
versions of the log files generated by Astra and
stored locally on the client machine, and can be generated
and saved via controls within the Action Tracker plug-in.
Astra also provides an option which allows the user to
append a log file to an existing Astra Activity File, so that
multiple log files can be conveniently combined for analysis
purposes. Once the Activity File or server log file has been
loaded, an Action Tracker dialog box (FIG. 19) opens which
provides controls for allowing the user to selectively display
different types of activity data on the map. In one implementation
(described separately below under the heading
AUTOMATED GENERATION OF LOAD TESTING
SCENARIOS), the Action Tracker dialog box includes a
"Load Wizard" button for allowing the user to initiate the
automatic generation of a load-testing scenario from the
activity data.

FIG. 19 illustrates the general display format used by the
Action Tracker plug-in to display activity levels on the links
of a site. As illustrated by the screen display, the links
between URLs are displayed using a color-coding scheme
which allows the user to associate different link colors (and
URL icon colors) with different relative levels of user
activity. As generally illustrated by the color legend, three
distinct colors are used to represent three (respective) adjacent
ranges of user activity.
In the illustrated display mode (uncolored links hidden,
uncolored URLs not hidden), all of the URLs of the site map
are displayed, but the only links that are displayed are those
which satisfy a user-adjustable minimum activity threshold.
Each visible link is displayed as a one-way arrow (indicating
the link direction), and includes a numerical annotation 
indicating the total number of hits revealed by the log or 
activity file. The number of hits per URL can be viewed in 
List View mode in a corresponding column. As can be seen 
from an observation of the screen display, the displayed 
links include links which do not appear in the Visual Web 
Display view of the map. 

With further reference to FIG. 19, a slide control 240 
allows the user to adjust the "hits" thresholds corresponding 
to each of the three colors. By clicking and dragging the 
slide control, the user can vary the number of displayed links
in a controllable manner to reveal different levels of user 
(visitor) activity. This feature is particularly useful for 
identifying congested links, which can be remedied by the 
addition of appropriate data redundancies. 

FIG. 20 illustrates the general process used by the Action
Tracker plug-in to detect the link activity data (number of
hits per link) from the log file. The displayed flow chart
assumes that the log file has already been retrieved, and that
the attribute "hits" has been defined for each link (Edge
object) of the Site Graph and set to zero. As illustrated by the
flow chart, the general decision process is applied line-by-line
to the log file (each line representing an access to a
URL) until all of the lines have been processed. With
reference to blocks 250 and 252, each time a new line of the
log file is read, it is initially determined whether or not the
log file reflects a previous access by the user to the Web site.
This determination is made by searching for other entries
within the log file which have the same user identifier (e.g.,
IP address) and an earlier date/time stamp.

Blocks 254 and 256 illustrate the steps that are performed
if the user (visitor) previously visited the site. Initially, the
Site Graph is accessed to determine whether a link exists
from the most-recently accessed URL to the current URL, as
indicated by decision block 254. If such a link exists, it is
assumed that the visitor used this link to get to the current
URL, and the usage level ("hits" attribute) of the identified
link is incremented by one. If no such link is identified
between the most-recently accessed URL and the current
URL, an assumption is made that the user back-tracked
along the navigation path (by using the "BACK" button of
the browser) before jumping to the current URL. Thus,
decision step 254 is repeated for each prior access by the
user to the site, in reverse chronological order, until either a
link to the current URL is identified or all of the prior
accesses are evaluated. If a link is detected during this
process, the "hits" attribute of the link is incremented.

As indicated by block 258, the above process continues on 
a line-by-line basis until all of the lines of the log file have 
been processed. Following the execution of this routine, the
"hits" attribute of each link represents an approximation
(based on the above assumptions) of the number of times the 
link was traversed during the time frame represented by the 
log file. 
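
The line-by-line decision process of FIG. 20 can be sketched as follows. This is a minimal sketch under stated assumptions rather than the Action Tracker code: log entries are assumed to be already parsed into (visitor identifier, timestamp, URL) tuples, the Site Graph is reduced to a set of (source, target) link pairs, and the function name `count_link_hits` is hypothetical.

```python
from collections import defaultdict

def count_link_hits(log_entries, links):
    """Approximate the number of traversals of each link.

    log_entries: iterable of (visitor_id, timestamp, url) tuples (assumed shape).
    links: set of (source_url, target_url) pairs taken from the site graph.
    Returns a dict mapping each traversed link to its "hits" count.
    """
    hits = defaultdict(int)
    history = defaultdict(list)  # visitor_id -> URLs accessed so far, oldest first

    # Process the log in chronological order, one access at a time.
    for visitor, _timestamp, url in sorted(log_entries, key=lambda e: e[1]):
        # Examine this visitor's prior accesses in reverse chronological order.
        # The first prior URL that links to the current URL is taken as the
        # link used; older candidates model use of the browser's BACK button.
        for previous in reversed(history[visitor]):
            if (previous, url) in links:
                hits[(previous, url)] += 1
                break
        history[visitor].append(url)
    return dict(hits)
```

A first access by a visitor has no history, so it increments nothing, which matches the flow chart's treatment of visitors with no previous access.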

As will be apparent, the general methodology illustrated
by the FIG. 20 flow chart can be used to detect a variety of
different types of activity information, which can be superimposed
on the site map (by modifying node and link
display attributes) in the same general manner as described
above. The following are examples of some of the types of
activity data that can be displayed, together with descriptions
of several features of the invention which relate to the
display of the activity data:



Exit Points. Exit points are deduced from the log file on 
a visitor-by-visitor basis by looking for the last URL 
accessed by each visitor, and by looking for large time 
gaps between consecutive accesses to the site. An 
"exits" attribute is defined for each node, to keep track 
of the total number of exit events from each node. The 
color-coding scheme described above is then used to 
allow the user to controllably display different thresh- 
olds of exit events. 

Usage Zones. When viewing a large site map in its 
entirety (as in FIG. 1), it tends to be difficult to identify 
individual URL icons within the map. This in turn
makes it difficult to view the color-coding scheme used 
by the Action Tracker plug-in to display URL usage 
levels. The Usage Zones feature alleviates this problem 
by enlarging the size of the colored URL icons (i.e., the 
icons of nodes which fall within the predetermined 
activity level thresholds) to a predetermined minimum 
size. (This is accomplished by increasing the "display 
size" attributes of these icons.) If these colored nodes 
are close together on the map, the enlarged icons merge 
to form a colored zone on the map. This facilitates the 
visual identification of high-activity zones of the site.

Complete Path Display. With this feature, the complete 
path of each visitor is displayed on the map on a 
visitor-by-visitor basis, with the visitor identifier and 
the URL access time tags displayed in the List View 
window 78 (FIG. 4). This feature permits fine-grain 
inspection of the site usage data, which is useful, for 
example, for analyzing security attacks and studying 
visitor behavior patterns. 

Log Filters. Because server access log files tend to be 
large, it is desirable to be able to filter the log file and 
to display only certain types of information. This 
feature allows the user to specify custom filters to be 
applied to the log file for purposes of limiting the scope 
of the usage analysis. Using this feature, the user can, 
for example, specify specific time and date ranges to 
monitor, or limit the usage analysis to specific IP 
addresses or domains. In addition, the user can specify 
a minimum visit duration which must be satisfied 
before the Action Tracker will count an access as a 
visit. 
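
As an illustration of the Exit Points deduction listed above, the following sketch counts an exit at the last URL accessed by each visitor and at any URL followed by a large time gap. It is a sketch under assumptions: the entries are assumed to be (visitor identifier, timestamp in seconds, URL) tuples, and the 1800-second gap and the name `count_exits` are illustrative choices, not values taken from the description.

```python
from collections import Counter, defaultdict

def count_exits(log_entries, max_gap_seconds=1800):
    """Count exit events per URL.

    log_entries: iterable of (visitor_id, timestamp_seconds, url) tuples (assumed shape).
    max_gap_seconds: gap between consecutive accesses treated as an exit (assumed value).
    """
    by_visitor = defaultdict(list)
    for visitor, timestamp, url in log_entries:
        by_visitor[visitor].append((timestamp, url))

    exits = Counter()
    for accesses in by_visitor.values():
        accesses.sort()  # chronological order for this visitor
        # A large gap before the next access counts as an exit from this URL.
        for (timestamp, url), (next_timestamp, _next_url) in zip(accesses, accesses[1:]):
            if next_timestamp - timestamp > max_gap_seconds:
                exits[url] += 1
        # The last URL accessed by the visitor is always an exit point.
        exits[accesses[-1][1]] += 1
    return exits
```

The resulting counts correspond to the per-node "exits" attribute, to which the same threshold-based color coding can then be applied.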

X. Map Comparison Tool (FIG. 21) 

FIG. 21 illustrates a screen display generated using 
Astra's Change Viewer map comparison tool. As illustrated 
by the screen display, the comparison tool generates a 
comparison map 268 which uses a color-coding scheme to 
highlight differences between two site maps, allowing the 
user to visualize the changes that have been made to a Web 
site since a prior mapping of the site. Using the check boxes 
within the Change Viewer dialog box 270, the user can 
selectively display the following: new URLs and links, 
modified URLs, deleted URLs and links, and unmodified 
URLs and links. As illustrated, each node and link of the 
comparison map is displayed in one of four distinct colors to 
indicate its respective comparison status: new, modified, 
deleted, or unmodified. 
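
One way to derive the four comparison statuses is sketched below. It assumes that nodes are matched by URL and that a change is detected from the date/time of last modification; each map is reduced here to a dictionary from URL to that timestamp, and the name `compare_maps` is hypothetical. Links could be classified the same way using (source, target) pairs as keys.

```python
def compare_maps(old_nodes, new_nodes):
    """Classify each URL as new, modified, unmodified, or deleted.

    old_nodes, new_nodes: dicts mapping URL -> last-modification timestamp
    for the prior map and the current map (assumed shape).
    """
    status = {}
    for url, modified in new_nodes.items():
        if url not in old_nodes:
            status[url] = "new"
        elif modified != old_nodes[url]:
            status[url] = "modified"
        else:
            status[url] = "unmodified"
    # URLs present only in the prior map have been removed from the site.
    for url in old_nodes:
        if url not in new_nodes:
            status[url] = "deleted"
    return status
```

Each status would then be mapped to one of the four display colors, and the check boxes of the dialog box would simply toggle the visibility of nodes by status.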

To compare two maps, the user selects the "Compare 
Maps" option from the TOOLS menu while viewing the 
current map, and then specifies the filename of the prior 
map. Astra then performs a node-by-node and link-by-link
comparison of the two map structures (Site Graphs) to 
identify the changes. This involves comparing the "URL" 
attributes of the associated Node and Edge objects to iden- 
tify URLs and links that have been added and deleted, and 
comparing the "date/time of last modification" attributes of 
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like Node objects (i.e., Node objects with the same "URL" this feature is described with reference to the load-testing of 

attribute) to identify URLs that have been modified. During Web sites, the scenario generation methodology can be 

this process, a comparison map data structure is generated applied to the testing of other types of server applications 

which reflects the comparison of the two maps, using color that maintain access logs. For instance, the methodology 

attributes to indicate the comparison outcomes (new, s could be used to test a mainframe terminal application for 

modified, deleted or unmodified) of the resulting nodes and which a central machine maintains a log of accesses to 

links. Once the comparison map data structure has been different screens. 

generated, Astra applies the Solar Layout method to the To facilitate an understanding of the Load Wizard feature, 

structure and displays the comparison map 268 in Astra's an overview is initially provided of the Web site load-testing 

VWD format. (The user can also view the comparison map to features of the commercially-available LoadRunner and 

in Astra's "incoming links" and "outgoing links" display Astra SiteTest products, including the scenario generation 

modes.) The user can then adjust the "show" settings in the features of these products. Additional information about 

dialog box 270, which causes Astra to traverse the compari- these features, and about the LoadRunner and Astra SiteTest 

son map data structure and adjust the visibility attributes products generally, is available from Mercury Interactive 

according to the current settings. is Corporation of Sunnyvale California. As in the above 

In one embodiment, the Astra code includes an Auto- description of the Action Tracker plug-in, the term "user" is 

mated Comparison feature which allows the user to schedule used in the following Load Wizard description to refer to 

the automatic generation of comparison maps on a periodic users of the disclosed Web site testing tools (e.g., Astra and 

basis (e.g., once a week). When this feature is used, Astra Astra SiteTest), and the term "visitor" is used to refer to 

automatically re-scans the site at the scheduled times, and 20 users of the subject Web site, 
then uses the superseded and the new map files to generate 

and save a comparison map. One variation of this feature (a) Web Site Testing with LoadRunner and SiteTest 

involves automatically sending an email containing a list of A , -uju t Jn ja , o,^ *r» j 

# . . ,, mr . U1 t , & . As described above, LoadRunner and Astra SiteTest Prod- 

the new and modified URU (and possibly other companson uc(s (hereinafter «Lo adRunner .. and "SiteTest") use pre- 

data) to . pre-spectfied mdiwdual whenever the automatic 25 load testing scripts, referred to herein as "Web 

companson takes place. The email (and/or he companson scfj „ *^ J ^ ^ 

map) can then be used by the md'v.dual for example, to consists essentially of a sequence of HTTP messages (stored 

efficiently review the additions to the Web site. ... . . . fi , \ vu . «• r . 

vt t • i r» ■ m ■ /r to within a script file), with each message representing a client 

XI. Link Repatr Plug-in (FIG. 22) w „ 7 , <: ™ c n . r r . * , 

11 -11 ♦ . .u f a . » t ■ i n ♦ ™ request to a Web site. The following is an example of a Web 

FIG. 22 illustrates the operation of Astra s Link Doctor 30 . , . # . f TmT & . * 

. . ~ ... - ; i . .l «r • i script consisting of three URL request messages: 

plug-in. To access this feature, the user selects the 'Link -i & 

Doctor" option from the TOOLS menu while viewing a site URL("http: //www. merc-int.com/forms/edu_reg.html"); 

map. The Link Doctor dialog box 284 then appears with a UR^'httpVAvww.merc-int.com/cgi-bi/index.htmr'); 

listing (in the "broken links" pane 286) of all of the broken URL("htlp://www.merc-int.com/cgi-bi/login.pr*); 

links (i.e., URLs of missing content objects) detected within 35 Form submissions and other types of requests that invoke 

the site map. (Astra detects the missing links by searching "back end" Web site components can be included, 

the Site Graph for Node objects having a status of "not In current implementations of LoadRunner and SiteTest, 

found.") When the user selects a URL from the broken links the Web scripts are generated by capturing and recording the 

pane (as illustrated in the screen display), Astra automati- output of a standard Web browser during interactive brows- 

cally lists all of the URLs which reference the missing 40 ing of the site by a user. The browser output is captured and 

content object in the "appearing in" pane 288. This allows recorded using a "Web Vuser Generator" component, which 

the user to rapidly identify all of the URLs (content objects) uses the proxy-based capture technique (illustrated in FIGS, 

that are directly affected by the broken link. 11 and 12 and described above) used with the Dynamic Scan 

In addition to listing all of the referencing URLs in the feature. The Web scripts can also be typed-in and/or edited 

"appearing in" pane 288, Astra generates a graphical display 45 manually. 

(in Astra's "incoming links" display mode) which shows the During the load testing process, Web scripts are sequen- 

selected (missing) URL 290 and all of the URLs 292 which tially played back or "run" by sequentially submitting the 

have links to the missing URL. In this example, the missing request messages to the site. This is preferably accomplished 

URL is a GIF file which is embedded within eight different using a Vuser executable. As depicted in FIG. 25, multiple 

HTML files 292. From the display shown in FIG. 22, the 50 Vusers (i.e., multiple instances of the Vuser executable) can 

user can select one of the referencing nodes 292 (by either be run simultaneously on a single workstation, with different 

clicking on its icon or its listing in the "appearing in" pane), Vusers optionally running different Web scripts. This pro- 

and then select the "Edit" button 296 to edit the HTML duces a load in which multiple client requests can (and 

document and eliminate the reference to the missing file. typically will) be pending at-a-time, as is commonly the case 

XII. Automated Generation of Load Testing Scenarios 55 when large numbers of concurrent visitors are accessing the 
(FIGS. 25-32) site. As illustrated in FIG. 25, the Vusers communicate with 

An important feature of the invention involves the ability and run under the control of the LoadRunner or SiteTest 

to automatically generate load testing scripts, and associated Controller 298 (both referred to herein simply as "the 

"scenario files," from server access log files of Web sites. In Controller"). As illustrated by the partial screen display of 

the preferred embodiment, this feature (referred to as "Load 60 FIG. 26, the Controller 298 includes a user interface that 

Wizard'*) is implemented within a modified version of the allows the user to selectively launch and monitor the Vusers, 

above-described Action Tracker plug-in, and is invoked by During the load testing process, each Vuser monitors the 

selecting a Load Wizard button 300 (FIG. 27) of the Action Web site's responses to the client requests submitted by that 

Tracker dialog box. The test scripts and scenario files Vuser, and records various performance-related characteris- 

generated via the Load Wizard feature can be used to 65 tics of these responses. These characteristics include, for 

load -test a Web site using either the LoadRunner product or example, response times to individual client requests, tim- 

the Astra SiteTest product of Mercury Interactive. Although eout events, and error events. Following the load testing 
process, the user is presented with a set of graphical reports
that allow the user to evaluate the site's performance. Using
these reports, the user can, for example, compare response
times of different site components (Web servers, CGI scripts,
APIs, proxy servers, etc.) to identify bottlenecks and other
performance problems.

To facilitate the formulation of repeatable, multi-Vuser
load tests, LoadRunner and SiteTest include code for allowing
the user to define a test "scenario" that specifies the
details of the test. A scenario may, for example, represent ten
users that are concurrently attempting to access a particular
back-end database of a Web site. Within a scenario, Vusers
can be arranged into groups (referred to as "Simulation
Groups" or "Sgroups") that run the same Web script. For
example, an Sgroup of 10 Vusers can be formed to run a Web
script "register.txt," and an Sgroup of 5 Vusers can be
formed to run a Web script "browse.txt." (The screen display
of FIG. 26 illustrates an Sgroup "G1" consisting of three
Vusers that run the Web script "WEBSITES.TXT.") Once a
scenario has been defined, the scenario can be loaded and
run repeatedly via the Controller 298.

To define a scenario, the user initially uses the Web Vuser
Generator component (not shown) to generate the Web
scripts to be included within the scenario (assuming these
Web scripts do not already exist). The user then invokes a
Scenario Wizard component (not shown) to formulate the
scenario. Using the Scenario Wizard, the user specifies such
details as the number of Vusers, the Web script (identified by
file name) to be run by each Vuser, and the number of
consecutive times ("loops") that each Web script is to be run
during scenario execution. The user can also define one or
more Sgroups, and can specify various testing parameters
(such as a transaction timeout period) to be used during
execution of the scenario. Once the scenario has been defined, the
details of the scenario are stored in a scenario file under a
user-specified filename. The scenario file and the associated
script file(s) fully define the scenario.

(b) Overview of Scenario Generation Process

An objective of the present invention is to reduce the
complexity of the above-described scenario generation
process, including the process of generating Web scripts.
Another objective is to provide a mechanism for generating
testing scenarios that more accurately represent load
distributions present during typical site usage.

In accordance with these objectives, a code module is
provided that automatically generates a load testing scenario
(including the scenario file and the Web script files) from
information stored within server access log files. Because
the process is automatic, the user does not have to record the
output of the browser, define Sgroups, and perform other
activities normally associated with scenario generation.

An important benefit of the process is that the resulting
scenarios closely represent the load distributions reflected by
the server access log files. Thus, for example, if a log file
indicates heavy access to a first area (e.g., URL) of a site and
relatively light access to a second area of the site, the
resulting scenario will stress the first area more heavily than
the second area (according to the general proportions
indicated by the log file). Because the log files represent the
actual browsing behaviors of past visitors to the site (and
typically large cross sections of visitors), the load tests
accurately emulate realistic load conditions.

FIGS. 27-29 are partial screen displays which illustrate,
in example form, the sequence of events (as seen by the user)
[...] generate a scenario. With reference to FIG. 27, the user
initially loads-in a log file (or possibly multiple log files)
while viewing a map of the site. In one implementation, the
log file may be either an HTTP Server Access Log File or an
HTTP Server Referrer Log File. Alternatively, the user may
load-in an Astra Activity File that was previously saved. As
described above, an Astra Activity File is a condensed file
format used by the Action Tracker plug-in to store server
access log files.

Once the log file has been loaded-in, the associated site
usage information is displayed on the map according to the
current Action Tracker settings, as illustrated in FIG. 27 and
described above. At this point, the user can select the Load
Wizard button 300 from the Action Tracker dialog box to
bring up the "create site test scenario" dialog box of FIG. 28.
From this dialog box, the user can enter the name of the
scenario to be generated, and can specify the number of
Vusers (up to 50) and the number of loops. The number of
Vusers controls the magnitude of the load to be applied to
the site. The number of loops controls the duration of the
test. In this example, the user has specified a scenario name
of "TEST5," and has selected test parameters of thirty
Vusers and ten loops.

Once the user selects the "OK" button 304, the
wizard code implements a two-phase process (discussed
below) to generate the load testing scenario, which, as
indicated above, is represented as a scenario file and at least
one script file. As illustrated in FIG. 29, a dialog box 310 is
then displayed to the user indicating the results of the
automated scenario generation process. In this example, the
dialog message indicates that the scenario "TEST5" was
successfully generated, and that execution of the scenario
will produce 22,200 hits to the Web server. From this dialog
box 310, the user can select the "YES" button to launch the
Controller and run the scenario.

FIG. 30 illustrates the
flow of information during the automated scenario generation
process. The code that generates the scenario is represented
in this figure as the Load Wizard module 320. The
inputs to the Load Wizard module 320 are a list of visitor
routes (extracted from the server access log file), the
user-specified numbers of Vusers and loops, and the scenario
name. The number of loops and the scenario name are shown
in parentheses to indicate that they are not part of the
two-phase process used to reduce server log file information
to Web scripts.

Each route within the routes list is in the general form:

    URL1-URL2 (occurrences),

wherein "URL1-URL2" represents a link from URL1 to
URL2, and "occurrences" is a numerical value representing
the number of times this route was taken by a visitor (as
represented within the server access log file). The general
process used to translate the server log file data into this
format is illustrated in FIG. 20 and described above. In one
implementation, the FIG. 20 process is implemented such
that all non-HTML URLs are filtered out, leaving only the
routes between Web documents. As illustrated in the
example that follows, the scenario generation process
implemented by the Load Wizard module 320 can handle routes
that include three or more URLs.

With further reference to FIG. 30, the outputs of Load
Wizard module 320 include one or more script files 322
(each containing a respective Web script, in textual form),
and a scenario file 324. The scenario file 324 is a text file that
includes various control information of the scenario, includ-

that occur when the user invokes the Load Wizard feature to ing the filenames and paths of the script files, the allocations 
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of Vusers to script files (including any Sgroup definitions), 
and the number of loops for running the scripts. The Web 
scripts and other scenario information can, of course, be 
stored in other forms. For example, the scenario information 
can be stored as one or more executable files that run during 
the load testing process. 

As depicted in FIG. 30, the Load Wizard module 320 
applies a two-phase process to the input data to generate the 
scenario. This process is illustrated in FIGS. 31 and 32, and 
is described below. Following the generation of the scenario, 
the Load Wizard module 320 launches the SiteTest or 
Loadrunner Controller 328 (assuming the user selects the 
"YES" option in the FIG. 29 dialog). Upon launching the 
Controller 328, the Load Wizard module 320 passes to the 
Controller the paths and filenames of the scenario and Web 
script files to allow the Controller to bring up the scenario 
for execution. In one embodiment, the Web scripts and other 
scenario data are automatically compiled into an executable 
form prior to the running of the scenario. 

(c) Two-Phase Translation Process 

FIGS. 31 and 32 illustrate phases 1 and 2, respectively, of 
the translation process implemented by the Load Wizard 
module 320 to generate a scenario. The description of the 
process will be provided in conjunction with a simple 
example in which the number of Vusers specified by the user 
is ten, and in which the routes list is as follows: 

A-B-C (100) 

B-D (200) 

A-B (250) 

C-F (80) 

In this hypothetical routes list, each alphabetic character 
(A-F) represents a unique URL of the Web site to be tested, 
and the number in parentheses is the "occurrences" value of 
the corresponding route. 
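
As a minimal sketch of how such a routes list might be held in memory, the following Python fragment parses entries of the form "URL1-URL2 (occurrences)" into pairs of URL sequence and count. The names `Route`, `parse_route` and `ROUTES_TEXT` are hypothetical and do not come from Appendix C. Splitting on "-" is an assumption that only suits the single-letter placeholders of this example, since real URLs may themselves contain hyphens.

```python
import re
from typing import List, Tuple

# Hypothetical in-memory form of a route: (ordered URLs, occurrence count).
Route = Tuple[List[str], int]


def parse_route(line: str) -> Route:
    """Parse one 'URL1-URL2-... (occurrences)' entry of a routes list."""
    match = re.fullmatch(r"\s*(\S+)\s*\((\d+)\)\s*", line)
    if match is None:
        raise ValueError(f"not a route entry: {line!r}")
    # Assumption: '-' separates URLs (valid for the letters A-F used here).
    urls = match.group(1).split("-")
    return urls, int(match.group(2))


# The hypothetical routes list used in the example.
ROUTES_TEXT = """A-B-C (100)
B-D (200)
A-B (250)
C-F (80)"""

routes = [parse_route(line) for line in ROUTES_TEXT.splitlines()]
total_occurrences = sum(count for _, count in routes)
```

For the example list, `total_occurrences` is the sum 100+200+250+80 that phase 1 compares against the number of Vusers.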

The general goal of the two-phase process is to translate 
the routes list into a set of Web scripts to be run by the 
Vusers, while preserving the general load distribution rep- 
resented by the routes list. In performing this task, the 
process seeks to generate Web scripts that are approximately 
equal in length, so that each Vuser will generate roughly the 
same number of hits during scenario execution. 

With reference to FIG. 31, the first phase of the process 
attempts to merge consecutive routes while maintaining the 
load distribution represented by the routes list. As repre- 
sented by decision block 330, a determination is initially 
made whether the sum of the occurrences (100+200+250+ 
80=630 in this example) is greater than the number of Vusers 
(ten in this example). On the first pass of the phase 1 
program loop, the sum of the occurrences will normally 
greatly exceed the number of Vusers, since the log file will 
typically represent all hits to the site over an extended period 
of time (e.g., one week). If the sum of the occurrences 
exceeds the number of Vusers, a search is conducted of the 
routes list for a pair of consecutive routes (i.e., a pair of 
routes R1, R2 for which the last URL of R1 matches the first 
URL of R2), as indicated by decision block 332. If no such 
pair exists within the routes list, or if the sum of the 
occurrences does not exceed the number of Vusers, the 
process proceeds to phase 2 (FIG. 32), as indicated by block 
334. 

With reference to blocks 338 and 340, if a pair of 
consecutive routes (R1, R2) is found, R1 and R2 are merged 
to form a new route which replaces the route (R1 or R2) 
having the smaller occurrence value. The new route is 
assigned the lower of the occurrence values of R1 and R2, 
and the remaining (unreplaced) route of R1 and R2 is 
assigned an occurrence value equal to the difference 
between the occurrences of R1 and R2. In this example, the 
routes A-B-C (100) and C-F (80) are merged to form a new 
route A-B-C-F, which replaces route C-F (80), and this new 
route is assigned an occurrence of 80. The remaining 
(unreplaced) route, A-B-C, is assigned an occurrence of 
100-80=20. The modified routes list is thus: 

A-B-C (20) 
B-D (200) 

A-B (250) 

A-B-C-F (80), 

for which the sum of the occurrences is 550. As will be 
recognized by comparing this modified routes list to the 
original routes list, this process of merging consecutive 
routes does not affect the load distribution associated with 
the routes list. 
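
One way to check this observation is to count how often each link is traversed before and after the merge. The sketch below assumes that "load distribution" can be read as the number of traversals of each URL-to-URL link; that reading, and the helper name `link_counts`, are illustrative assumptions rather than definitions taken from the specification.

```python
from collections import Counter


def link_counts(routes):
    """Count traversals of each (source URL, destination URL) link."""
    counts = Counter()
    for urls, occurrences in routes:
        # Each adjacent pair of URLs in a route is one link.
        for src, dst in zip(urls, urls[1:]):
            counts[(src, dst)] += occurrences
    return counts


# Routes list before and after the first merge of the example.
original = [
    (["A", "B", "C"], 100),
    (["B", "D"], 200),
    (["A", "B"], 250),
    (["C", "F"], 80),
]
after_first_merge = [
    (["A", "B", "C"], 20),
    (["B", "D"], 200),
    (["A", "B"], 250),
    (["A", "B", "C", "F"], 80),
]

same_distribution = link_counts(original) == link_counts(after_first_merge)
```

Under this reading, the sum of the occurrences falls from 630 to 550 while the traversal count of every individual link is unchanged.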
The above-described process of merging consecutive 
routes is repeated iteratively until either the sum of the 
occurrences does not exceed the number of Vusers, or no 
consecutive routes remain in the routes list. In the present 
example, the second iteration of this process results in the 
merger of routes A-B (250) and B-D (200), resulting in the 
following routes list: 
A-B-C (20) 
A-B-D (200) 
A-B (50) 
A-B-C-F (80). 

At this point, no consecutive routes remain in the list. Thus, 
the process proceeds to phase 2. 
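
The phase 1 loop described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Appendix C listing: the function names are hypothetical, the pair search scans the list in order and takes the first match (the text does not specify a search order), and a route whose occurrence drops to zero is discarded (a case the text does not address).

```python
from typing import List, Optional, Tuple

Route = Tuple[List[str], int]  # (ordered URLs, occurrence count)


def find_consecutive_pair(routes: List[Route]) -> Optional[Tuple[int, int]]:
    """Return indices (i, j) where routes[i] ends at the URL routes[j] starts with."""
    for i, (first, _) in enumerate(routes):
        for j, (second, _) in enumerate(routes):
            if i != j and first[-1] == second[0]:
                return i, j
    return None


def merge_consecutive_routes(routes: List[Route], num_vusers: int) -> List[Route]:
    """Phase 1 sketch: merge consecutive routes while occurrences exceed Vusers."""
    routes = [(list(urls), occ) for urls, occ in routes]
    while sum(occ for _, occ in routes) > num_vusers:
        pair = find_consecutive_pair(routes)
        if pair is None:
            break  # no consecutive routes remain
        i, j = pair
        (urls1, occ1), (urls2, occ2) = routes[i], routes[j]
        # The shared URL appears once in the merged route.
        merged = (urls1 + urls2[1:], min(occ1, occ2))
        smaller, larger = (i, j) if occ1 <= occ2 else (j, i)
        routes[smaller] = merged
        routes[larger] = (routes[larger][0], abs(occ1 - occ2))
        # Assumption: routes left with zero occurrences are dropped.
        routes = [route for route in routes if route[1] > 0]
    return routes


example_routes = [
    (["A", "B", "C"], 100),
    (["B", "D"], 200),
    (["A", "B"], 250),
    (["C", "F"], 80),
]
merged_routes = merge_consecutive_routes(example_routes, num_vusers=10)
```

With the example routes list and ten Vusers, this sketch performs the same two mergers as the text and stops when no consecutive routes remain.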

The general function of phase 2 is to condense the routes 
list into a shorter list of longer routes (while maintaining 
general congruence between the lengths of the routes), and 
then to assign the resulting routes to Vusers (or Vuser 
groups) as Web scripts. Some modification to the original 
load distribution typically occurs as the result of this 
process. 

With reference to block 350 of FIG. 32, the first step of 
this phase 2 process involves sorting the routes list 
according to the occurrence values. This produces the following 
list: 
A-B-C (20) 
A-B (50) 
A-B-C-F (80) 
A-B-D (200). 

With reference to decision block 352, if the sum of the 
occurrences is greater than the number of Vusers, the 
process enters into a loop (blocks 352-360), and remains in 
this loop as long as the condition is met. The first processing 
step of this loop, which is represented by block 354, involves 
locating the pair of adjacent routes for which the sum of 
route lengths is smallest (wherein route length refers to the 
number of URLs in the route), and then concatenating the 
two routes of the pair to form a single route. This new route 
replaces the concatenated routes, and is assigned an 
occurrence equal to the average of the occurrences of the 
concatenated routes. In the present example, the first iteration of 
this step produces a concatenation of routes A-B-C (20) and 
A-B (50), resulting in the following list: 
A-B-C-A-B (35) 
A-B-C-F (80) 
A-B-D (200). 
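
The sorting step and the concatenation step of block 354 can be sketched as follows. The function name `concatenate_shortest_adjacent_pair` is hypothetical, and taking the first pair when several pairs tie for the smallest combined length is an assumption, since the text does not say how ties are resolved.

```python
from typing import List, Tuple

Route = Tuple[List[str], float]  # (ordered URLs, occurrence value)


def concatenate_shortest_adjacent_pair(routes: List[Route]) -> List[Route]:
    """Concatenate the adjacent pair of routes with the smallest combined length.

    `routes` is expected to be sorted by occurrence value already.
    """
    if len(routes) < 2:
        return list(routes)
    # Index of the first route of the adjacent pair with the fewest URLs in total.
    best = min(
        range(len(routes) - 1),
        key=lambda k: len(routes[k][0]) + len(routes[k + 1][0]),
    )
    (urls1, occ1), (urls2, occ2) = routes[best], routes[best + 1]
    # The new route carries the average of the two occurrence values.
    combined = (urls1 + urls2, (occ1 + occ2) / 2)
    return routes[:best] + [combined] + routes[best + 2:]


phase1_output = [
    (["A", "B", "C"], 20),
    (["A", "B", "D"], 200),
    (["A", "B"], 50),
    (["A", "B", "C", "F"], 80),
]
sorted_routes = sorted(phase1_output, key=lambda route: route[1])
condensed = concatenate_shortest_adjacent_pair(sorted_routes)
```

In the example, the adjacent pairs have combined lengths of 5, 6 and 7 URLs, so the first pair (A-B-C and A-B) is the one concatenated.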

The second step of this loop, represented by block 360,
involves dividing all of the occurrences by the smallest
occurrence value (unless the smallest occurrence value is
one or lower). The division in this step and step 352 may be
either regular division (as in this example) or integer
division. The result of this step produces the following list:
A-B-C-A-B (1)
A-B-C-F (2.29)
A-B-D (5.71),
which has a sum of occurrences of 9. Because 9 is less than
10 (the number of Vusers), the process exits the program
loop at step 352.

With reference to block 364, the occurrence values are
then adjusted, if necessary, to produce a set of integer values
for which the sum is equal to the number of Vusers. Any
scaling/truncating technique that maintains the general
proportions between the occurrence values may be used for this
purpose. This step may, for example, produce the following:
A-B-C-A-B (1)
A-B-C-F (3)
A-B-D (6)

With reference to blocks 366 and 368, each route is then
outputted as a respective Web script file. In addition, the
process generates a scenario file that specifies the allocations
of Vusers to Web scripts. In this example, the scenario would
include a single Vuser that runs the A-B-C-A-B script, a
group (Sgroup) of three Vusers that run the A-B-C-F script,
and a group of six Vusers that run the A-B-D script. The
scenario file would also include the loop information
specified by the user.

(d) Source Code Listing

Included as Appendix C is a source code listing which
includes an implementation of the two-phase process. The
listing also includes code for implementing the Vusers.

XII. Conclusion

While certain preferred embodiments of the invention
have been described, these embodiments have been
presented by way of example only, and are not intended to limit
the scope of the present invention. For example, although
the present invention has been described with reference to
the standard protocols, services and components of the
World Wide Web, it should be recognized that the invention
is not so limited, and that the various aspects of the invention
can be readily applied to other types of web sites and server
applications (including, for example, mainframe terminal
applications). Accordingly, the breadth and scope of the
present invention should be defined only in accordance with
the following claims and their equivalents.

In the claims which follow, reference characters used to
designate claim steps are provided for convenience of
description only, and are not intended to imply any particular
order for performing the steps.

What is claimed is:

1. A computer-implemented method of generating scripts
that are adapted to be played to exercise a web site,
comprising:
processing a server access log associated with the web site
to identify a plurality of navigation routes followed by
visitors of the web site during ordinary, post-deployment
usage of the web site, the server access log
reflecting actions of a plurality of said visitors; and
translating the plurality of navigation routes into at least
one script that specifies a sequence of client request
messages for exercising the web site.

2. The method as in claim 1, wherein translating the
plurality of navigation routes into at least one script
comprises preserving a general distribution of access requests
among web site content entities as reflected within the server
access log.

3. The method as in claim 1, wherein processing the
server access log to identify a plurality of navigation routes
comprises identifying first and second content entities
accessed by a visitor in sequence, and determining whether
the web site includes a navigational link between the first
and second content entities.

4. The method as in claim 3, further comprising
incrementing a counter associated with the navigational link to
indicate that the visitor followed the navigational link.

5. The method as in claim 1, wherein processing the
server access log comprises determining of a number of
times each of a plurality of navigational links of the web site
was followed by a visitor.

6. The method as in claim 1, wherein translating the
plurality of navigation routes into at least one script
comprises merging consecutive routes of the plurality of
navigation routes.

7. The method as in claim 1, wherein the script comprises
a plurality of Uniform Resource Locators associated with
the web site.

8. The method as in claim 1, wherein the method
comprises translating the plurality of navigation routes into a
scenario that comprises multiple scripts.

9. The method as in claim 8, wherein translating the
plurality of navigation routes into a scenario comprises
preserving a general distribution of access requests among
web site content entities as reflected within the server access
log.

10. The method as in claim 1, wherein the server access
log comprises a standard-format server access log file.

11. The method as in claim 10, wherein the standard-format
server access log file comprises an HTTP Server
Access Log file or an HTTP Server Referrer Log file.

12. The method as in claim 1, further comprising running
the at least one script while monitoring server response
times to client request messages specified therein.

13. A computer-readable medium having stored thereon a
computer program which, when executed by a computer
processes a server access log reflective of actions of a
plurality of visitors of a web site to identify a plurality
of navigation routes followed by said visitors of the
web site; and
translates the plurality of navigation routes into at least
one script that specifies a sequence of client request
messages for exercising the web site.

14. The computer-readable medium as in claim 13,
wherein the computer program translates the plurality of
navigation routes into at least one script so as to preserve a
general distribution of access requests among web site
content entities as reflected within the server access log.

15. The computer-readable medium as in claim 13,
wherein the computer program processes the server access
log to identify a plurality of navigation routes by at least
identifying first and second content entities accessed by a
visitor in sequence, and determining whether the web site
includes a navigational link between the first and second
content entities.

16. The computer-readable medium as in claim 15,
wherein the computer program increments a counter
associated with the navigational link to indicate that the visitor
followed the navigational link.

17. The computer-readable medium as in claim 13,
wherein the computer program uses the server access log to
determine a number of times each of a plurality of
navigational links of the web site was followed by a visitor.




18. The computer-readable medium as in claim 13,
wherein the computer program translates the plurality of
navigation routes into at least one script by at least merging
consecutive routes of the plurality of navigation routes.

19. The computer-readable medium as in claim 13,
wherein the script comprises a plurality of Uniform
Resource Locators associated with the web site.

20. The computer-readable medium as in claim 13,
wherein the computer program translates the plurality of
navigation routes into a scenario that comprises multiple
scripts.

21. The computer-readable medium as in claim 20,
wherein the computer program translates the plurality of
navigation routes into a scenario so as to preserve a general
distribution of access requests among web site content
entities as reflected within the server access log.

22. The computer-readable medium as in claim 13,
wherein the server access log comprises a standard-format
server access log file.

23. The computer-readable medium as in claim 22,
wherein the standard-format server access log file comprises
an HTTP Server Access Log file or an HTTP Server Referrer
Log file.

24. The computer-readable medium as in claim 13, further
comprising a computer program which runs the at least one
script while monitoring server response times to client
request messages specified therein.

25. A computer-implemented method of evaluating the
performance of a web site, comprising:
processing a server access log associated with the web site
to generate a plurality of scripts such that a general
distribution of access requests among web site content
entities as reflected within the server access log is
preserved; and
executing the plurality of scripts to exercise the web site
while monitoring web site performance.

26. The method as in claim 25, wherein processing the
server access log comprises tracking navigation routes
followed by visitors during ordinary, post-deployment usage of
the web site to identify a plurality of navigation routes
followed by the visitors, and incorporating the plurality of
navigation routes into the plurality of scripts.

27. The method as in claim 25, wherein processing the
server access log comprises determining of a number of
times each of a plurality of navigational links of the web site
was followed by a visitor.

28. The method as in claim 25, wherein each script
comprises a plurality of Uniform Resource Locators
associated with the web site.

29. The method as in claim 25, wherein the server access
log comprises a standard-format server access log file.

30. The method as in claim 29, wherein the standard-format
server access log file comprises an HTTP Server
Access Log file or an HTTP Server Referrer Log file.

31. The method as in claim 25, wherein monitoring web
site performance comprises monitoring response times to
client requests.

32. The method as in claim 25, wherein executing the
plurality of scripts comprises load-testing the web site.

* * * * *